Rasterising Aliased Lines

ABSTRACT

A method of rasterising a line in computer graphics determines whether the line's start and/or end is inside a diamond test area within a pixel. If the end is not inside and the start is inside, the pixel is drawn as part of the line. If neither the start nor the end of the line is inside, it is determined whether the line crosses more than one extended diamond edge and if so, it is further determined (i) whether an extended line passing through the start and end is substantially vertical and touches the right point of the diamond area, (ii) whether the extended line touches the bottom point of the diamond area, and (iii) whether the extended line is not on a same side of each point of the diamond area. If any of (i), (ii) and (iii) is positive, the pixel is drawn as part of the line.

CROSS-REFERENCE TO RELATED APPLICATIONS AND CLAIM OF PRIORITY

This application is a continuation under 35 U.S.C. 120 of copending application Ser. No. 17/744,649 filed May 14, 2022, now U.S. Pat. No. 11,710,263, which is a continuation of prior application Ser. No. 17/132,299 filed Dec. 23, 2020, now U.S. Pat. No. 11,354,835, which claims foreign priority under 35 U.S.C. 119 from United Kingdom Application No. 1919153.5 filed Dec. 23, 2019, the contents of which are incorporated herein by reference in their entirety.

BACKGROUND

In computer graphics, a set of surfaces representing objects in a scene is divided up into a number of smaller and simpler pieces (referred to as primitives), typically triangles, which are more amenable to rendering. The resulting divided surface is generally an approximation to the original surface, but the accuracy of this approximation can be improved by increasing the number of generated primitives, which in turn usually results in the primitives being smaller. The amount of sub-division is usually determined by a level of detail (LOD). An increased number of primitives is therefore typically used where a higher level of detail is required, e.g. because an object is closer to the viewer and/or the object has a more intricate shape. However, use of larger numbers of triangles increases the processing effort required to render the scene and hence increases the size of the hardware that performs the processing. Furthermore, as the average triangle size reduces, aliasing (e.g. when angled lines appear jagged) occurs more often. To address this aliasing, multisampling (i.e. taking several samples per pixel) may be used. Alternatively, where multisampling is not used, line rasterization rules may be used to define how angled lines are handled and in particular to determine which pixels are used to render the line.

As the number of primitives that are generated increases, the ability of a graphics processing system to process the primitives becomes more important. One known way of improving the efficiency of a graphics processing system is to render an image in a tile-based manner. In this way, the rendering space into which primitives are to be rendered is divided into a plurality of tiles, which can then be rendered independently from each other. A tile-based graphics system includes a tiling unit to tile the primitives, i.e. to determine, for a primitive, which of the tiles of a rendering space the primitive is in. Then, when a rendering unit renders the tile, it can be given information (e.g. a per-tile list) indicating which primitives should be used to render the tile.

An alternative to tile-based rendering is immediate-mode rendering. In such systems there is no tiling unit generating per-tile lists and each primitive appears to be rendered immediately; however, even in such systems, the rendering space may still be divided into tiles of pixels and rendering of each primitive may still be done on a tile-by-tile basis with each pixel in a tile being processed before progressing to the next tile. This is done to improve locality of memory references.

The embodiments described below are provided by way of example only and are not limiting of implementations which solve any or all of the disadvantages of known graphics processing pipelines.

SUMMARY

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

A method of rasterising a line comprises determining whether the line's start and/or end is inside a diamond test area within a pixel. If the end is not inside and the start is inside, the pixel is drawn as part of the line. If neither the start nor the end of the line is inside, it is determined whether the line crosses more than one extended diamond edge and if so, it is further determined (i) whether an extended line passing through the start and end is substantially vertical and touches the right point of the diamond area, (ii) whether the extended line touches the bottom point of the diamond area, and (iii) whether the extended line is not on a same side of each point of the diamond area. If any of (i), (ii) and (iii) is positive, the pixel is drawn as part of the line.

A first aspect provides a method of rasterising a line in a graphics processing pipeline, the line having a start point and an end point and the method comprising, for each pixel in an input set of pixels: determining whether the end point and/or the start point of the line is in a diamond test area within the pixel, wherein the diamond test area is defined by a top point, a left point, a bottom point and a right point connected by edges to form a diamond; in response to determining that the end point is not in the diamond test area and the start point of the line is in the diamond test area, adding the pixel to a set of pixels to be drawn as part of the line; in response to determining that neither the start point nor the end point of the line are in the diamond test area, determining if the line crosses more than one extended diamond edge, wherein an extended diamond edge is coincident with an edge of the diamond test area and extends beyond the diamond points that the edge connects; and in response to determining that the line crosses more than one extended diamond edge: determining if an extended line passing through the start and end point has a slope less than −1 or greater than +1 and touches the right point of the diamond test area; determining if the extended line touches the bottom point of the diamond test area; determining if the extended line is on a same side of each point of the diamond test area; and in response to determining that the extended line has a slope less than −1 or greater than +1 and touches the right point of the diamond test area or that the extended line touches the bottom point of the diamond test area or that the extended line is not on a same side of each point of the diamond test area, adding the pixel to a set of pixels to be drawn as part of the line.

A second aspect provides a graphics processing pipeline comprising a rasterization phase, the rasterization phase comprising hardware logic arranged to: determine whether the end point and/or the start point of the line is in a diamond test area within the pixel, wherein the diamond test area is defined by a top point, a left point, a bottom point and a right point connected by edges to form a diamond; in response to determining that the end point is not in the diamond test area and the start point of the line is in the diamond test area, to add the pixel to a set of pixels to be drawn as part of the line; in response to determining that neither the start point nor the end point of the line are in the diamond test area, determine if the line crosses more than one extended diamond edge, wherein an extended diamond edge is coincident with an edge of the diamond test area and extends beyond the diamond points that the edge connects; and in response to determining that the line crosses more than one extended diamond edge: to determine if an extended line passing through the start and end point has a slope less than −1 or greater than +1 and touches the right point of the diamond test area; to determine if the extended line touches the bottom point of the diamond test area; to determine if the extended line is on a same side of each point of the diamond test area; and in response to determining that the extended line has a slope less than −1 or greater than +1 and touches the right point of the diamond test area or that the extended line touches the bottom point of the diamond test area or that the extended line is not on a same side of each point of the diamond test area, to add the pixel to a set of pixels to be drawn as part of the line.

The graphics processing pipeline may be embodied in hardware on an integrated circuit. There may be provided a method of manufacturing, at an integrated circuit manufacturing system, a graphics processing pipeline. There may be provided an integrated circuit definition dataset that, when processed in an integrated circuit manufacturing system, configures the system to manufacture a graphics processing pipeline. There may be provided a non-transitory computer readable storage medium having stored thereon a computer readable description of an integrated circuit that, when processed, causes a layout processing system to generate a circuit layout description used in an integrated circuit manufacturing system to manufacture a graphics processing pipeline.

There may be provided an integrated circuit manufacturing system comprising: a non-transitory computer readable storage medium having stored thereon a computer readable integrated circuit description that describes the graphics processing pipeline; a layout processing system configured to process the integrated circuit description so as to generate a circuit layout description of an integrated circuit embodying the graphics processing pipeline; and an integrated circuit generation system configured to manufacture the graphics processing pipeline according to the circuit layout description.

There may be provided computer program code for performing any of the methods described herein. There may be provided a non-transitory computer readable storage medium having stored thereon computer readable instructions that, when executed at a computer system, cause the computer system to perform any of the methods described herein.

The above features may be combined as appropriate, as would be apparent to a skilled person, and may be combined with any of the aspects of the examples described herein.

BRIEF DESCRIPTION OF THE DRAWINGS

Examples will now be described in detail with reference to the accompanying drawings in which:

FIGS. 1A and 1B are schematic diagrams illustrating a diamond-shaped test area used in a line rasterization rule;

FIG. 2 is a schematic diagram of an example graphics processing unit (GPU) pipeline;

FIG. 3 is a flow diagram of a method of line rasterization which may be implemented by the line rasterization hardware shown in FIG. 2;

FIG. 4 is a schematic diagram showing example bounding boxes;

FIG. 5 is a schematic diagram showing the relationship between a line and the corresponding extended line;

FIG. 6 is a schematic diagram showing the four extended diamond edges of a pixel;

FIG. 7 is a schematic diagram of example line rasterization hardware that implements the method of FIG. 3 for a single pixel;

FIG. 8 is a schematic diagram showing the extended diamond edges of a group of 16 pixels;

FIG. 9 is a schematic diagram showing two extended edges which may be used to calculate results for all the extended diamond edges shown in FIG. 8;

FIG. 10 is a schematic diagram of example line rasterization hardware that implements the method of FIG. 3 for a group of pixels;

FIG. 11A is a schematic diagram of an example hardware arrangement for performing normal or conservative rasterization;

FIG. 11B is a schematic diagram showing how the hardware arrangement of FIG. 11A may additionally be used to perform the line rasterization method of FIG. 3;

FIG. 12A is a schematic diagram showing subdivision of a rendering space into tiles and microtiles;

FIG. 12B is a schematic diagram showing a part of FIG. 12A in more detail;

FIG. 13A is a schematic diagram of example edge test hardware 1300 that may be used both to perform conservative rasterization and the line rasterization method of FIG. 3;

FIG. 13B is a schematic diagram of another example edge test hardware that may be used both to perform conservative rasterization and the line rasterization method of FIG. 3;

FIG. 14 shows a computer system in which a graphics processing system is implemented; and

FIG. 15 shows an integrated circuit manufacturing system for generating an integrated circuit embodying a graphics processing system.

The accompanying drawings illustrate various examples. The skilled person will appreciate that the illustrated element boundaries (e.g., boxes, groups of boxes, or other shapes) in the drawings represent one example of the boundaries. It may be that in some examples, one element may be designed as multiple elements or that multiple elements may be designed as one element. Common reference numerals are used throughout the figures, where appropriate, to indicate similar features.

DETAILED DESCRIPTION

The following description is presented by way of example to enable a person skilled in the art to make and use the invention. The present invention is not limited to the embodiments described herein and various modifications to the disclosed embodiments will be apparent to those skilled in the art.

Embodiments will now be described by way of example only.

As mentioned above, when rendering lines (which may be edges of primitives or primitives themselves), line rasterization rules may be used to define how angled lines are handled and in particular to determine which pixels are used to render the line, i.e. which pixels are considered to be a visible part of the line and which pixels are not part of the line (and hence may be referred to as ‘non-visible’). An example of a line rasterization rule may be referred to as the ‘diamond exit rule’ and this is defined in the Direct3D 11 graphics specification (as can be found at: https://docs.microsoft.com/en-us/windows/desktop/direct3d11/d3d10-graphics-programming-guide-rasterizer-stage-rules). The diamond exit rule is also used in other standards (e.g. OpenGL). This rule uses a diamond-shaped test area 102 within each pixel 100 (as shown in FIGS. 1A and 1B) to determine if a line covers a pixel and the same rule may be used for a line strip as it is drawn as a sequence of connected lines. According to the Direct3D 11 graphics specification, the diamond test area differs slightly dependent upon the gradient of the line, with x-major lines being defined as those lines with a slope that is in the range −1 to +1 inclusive (such that the line is horizontal or close to horizontal) and y-major lines being all other lines. For x-major lines, the diamond test area includes the lower-left edge 104, the lower-right edge 106 and bottom corner 108 but excludes the upper-left edge 110, the upper-right edge 112, the top corner 114, the left corner 116 and the right corner 118, as shown in FIG. 1A (with the excluded lines being shown by dotted lines and the excluded corners being shown as open, rather than black, circles). For y-major lines, the diamond test area additionally includes the right corner 118 (but still excludes the upper-left edge 110, the upper-right edge 112, the top corner 114 and the left corner 116), as shown in FIG. 1B.

Described herein are a method and hardware for rasterising aliased lines which implement the diamond exit rule (e.g. of Direct3D 11) in an accurate manner (e.g. such that lines are not rendered as narrow parallelograms) and in an efficient manner, i.e. in terms of the amount of computation that is performed, the area of hardware (and in particular additional hardware specific to the diamond exit rule) that is required (and hence physical size) and power consumption. This hardware, which may be referred to as line rasterization hardware, may be implemented within the rasterization phase of a graphics processing pipeline (e.g. within a graphics processing unit, GPU).

In various examples, the method of rasterising an aliased line described herein may be implemented by reusing hardware that is also used for conservative rasterization (or other tasks involving edge testing) within a graphics processing pipeline. By reusing existing hardware, the overall hardware size is reduced compared to alternative methods that use dedicated hardware logic to implement the diamond exit rule. Additionally, by using the method described herein, aliased lines can be drawn in the same overall strategy/flow as triangle primitives without requiring high level architecture changes.

The methods described herein are compatible with graphics processing pipelines that are arranged to render in a rendering space that is sub-divided into a plurality of tiles, where each tile is sub-divided into a plurality of microtiles, and each microtile comprises an identical arrangement of pixels (this is shown graphically in FIG. 12, described below). Use of microtiles breaks up the tiles into segments that fully utilise the available computational logic (with empty microtiles being culled with a coarse edge test). In contrast, some known methods of implementing the diamond exit rule are not compatible with rasterising methods that use microtiles.

For the purposes of the following description, the diamond test area (which may be referred to simply as the ‘test area’) is as described above with reference to FIGS. 1A and 1B. Irrespective of the slope of the line, the test area includes the lower-left edge 104 and the lower-right edge 106 but not the upper-left edge 110 and the upper-right edge 112. For x-major lines, only the bottom corner 108 is included in the test area; for y-major lines, both the bottom corner 108 and the right corner 118 are included in the test area.

FIG. 2 is a schematic diagram of an example graphics processing unit (GPU) pipeline 200 which may be implemented in hardware within a GPU and which uses a tile-based rendering approach. The hardware described herein may also be used in a GPU that instead uses alternative rendering approaches where the rendering processes groups of pixels (e.g. where immediate mode rendering is used). As shown in FIG. 2, the pipeline 200 comprises a geometry processing phase 202 and a rasterization phase 204. Data generated by the geometry processing phase 202 may pass directly to the rasterization phase 204 and/or some of the data may be written to memory (e.g. parameter memory 205) by the geometry processing phase 202 and then read from memory by the rasterization phase 204.

The geometry processing phase 202 comprises a vertex shader 206, tessellation unit 208 and tiling unit 210. Between the vertex shader 206 and the tessellation unit (or tessellator) 208 there may be one or more optional hull shaders, not shown in FIG. 2. The geometry processing phase 202 may also comprise other elements not shown in FIG. 2, such as a memory and/or other elements. Where the GPU pipeline 200 is used for immediate mode rendering, the tiling unit 210 may be omitted and may be replaced by an alternative unit that groups pixels in some way in order to limit the number of pixels for which calculations are performed at any time.

The vertex shader 206 is responsible for performing per-vertex calculations. Unlike the vertex shader, the hardware tessellation unit 208 (and any optional hull shaders) operates per-patch and not per-vertex. The tessellation unit 208 outputs primitives and in systems which use vertex indexing, an output primitive may take the form of three vertex indices and a buffer of vertex data (e.g. for each vertex, a UV coordinate and in various examples, other parameters such as a displacement factor and optionally parent UV coordinates). Where indexing is not used, an output primitive may take the form of three domain vertices, where a domain vertex may comprise only a UV coordinate or may comprise a UV coordinate plus other parameters (e.g. a displacement factor and optionally, parent UV coordinates).

The tiling unit 210 generates per-tile display lists and outputs these, for example to the parameter memory 205. Each per-tile display list identifies, for a particular tile, those primitives which are at least partially located within that tile. These display lists may be generated by the tiling unit 210 using a tiling algorithm. Subsequent elements within the GPU pipeline 200, such as the rasterization phase 204, can then read the data from parameter memory 205.

The rasterization phase 204 rasterises some or all of the primitives generated by the geometry processing phase 202. The rasterization phase 204 comprises line rasterization hardware 211 and may also comprise conservative rasterization hardware 212 and/or other elements not shown in FIG. 2.

The line rasterization hardware 211 implements the diamond exit rule and in particular implements the method of rasterising aliased lines as described below. Where provided, the conservative rasterization hardware 212 in the rasterization phase 204 determines, for each pixel and for each of a plurality of primitives (e.g. each primitive on a per-tile display list), whether the pixel (i.e. the square pixel area, rather than a single sample position within the pixel) is partially or fully overlapped by the primitive. This is referred to as outer and inner coverage respectively. As described below, whilst the line rasterization hardware 211 and conservative rasterization hardware 212 are shown as separate blocks in FIG. 2, in various examples some of the hardware logic may be shared between the blocks and/or the two blocks may be merged because of the re-use of hardware for both conservative rasterization and line rasterization (in particular using the diamond exit rule).

FIG. 3 is a flow diagram of a method of line rasterization which may be implemented by the line rasterization hardware 211 in FIG. 2. As shown in FIG. 3, the method involves a hierarchy of tests which are applied to a plurality of pixels for each line which is to be rendered (or for each line segment of a line strip) and the order in which the tests are performed results in an efficient implementation (e.g. as it allows the test preconditions to be calculated at substantially the same time). Whilst a separate final determination is made for each pixel that is considered and for each line, computations that are involved in one of the tests may be reused for other tests for the same pixel and/or the same or different tests for different pixels and various examples of this will be described below.

The line rasterization method of FIG. 3 takes as inputs the parameters for the line to be rasterized. In various examples the method may be implemented for groups of pixels substantially in parallel.

The output of the line rasterization method of FIG. 3 is, for each pixel, a determination as to whether the pixel is drawn as part of the aliased line (block 302) or not drawn as part of the aliased line (block 304). Instead of the terms ‘drawn’ and ‘not drawn’, the terms ‘visible’ and ‘not visible’ may alternatively be used, with a pixel that is considered visible corresponding to a pixel which forms part of the rendered line and is therefore drawn and a pixel that is considered not visible corresponding to a pixel that does not form part of the rendered line and is therefore not drawn.

The line parameters which are input to the method of FIG. 3 may comprise the start and end points of the line and coefficients A, B and C from the equation of a line that passes through the start and end points of the line, which may be a vector of the form:

Ax+By+C

The coefficients A, B and C are constants. This vector may be referred to as the ‘extended line’ because the vector does not specify either the start or end point but extends beyond both. FIG. 5 is a schematic diagram showing the relationship between a line 502 and the corresponding extended line 504. The line 502 starts at the start point 506 (indicated by the small circle) and ends at the end point 508 (indicated by the arrow head) whilst the extended line passes through the start point 506 and then the end point 508 in a direction that is defined by the coefficients A and B. Alternatively, instead of inputting the coefficients A, B and C, the line parameters input to the method of FIG. 3 may only comprise the start and end points and the method of line rasterization may include an additional block (not shown in FIG. 3) in which the vector of the extended line is calculated and hence the values of the coefficients A, B and C are determined.
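The additional block that derives A, B and C from the start and end points might look like the minimal sketch below. The function names and the particular sign convention (A = y0 − y1, B = x1 − x0) are assumptions made for illustration; the description only requires that Ax+By+C is zero on the extended line. The x-major check uses the fact that the slope is −A/B, so a slope in the range −1 to +1 inclusive corresponds to |A| ≤ |B|.

```python
def extended_line_coefficients(start, end):
    """Return (A, B, C) with A*x + B*y + C == 0 for every point on the
    extended line through start and end (sign convention is an assumption)."""
    (x0, y0), (x1, y1) = start, end
    A = y0 - y1
    B = x1 - x0
    C = x0 * y1 - x1 * y0
    return A, B, C


def evaluate(coeffs, point):
    """Evaluate A*x + B*y + C at the given point."""
    A, B, C = coeffs
    x, y = point
    return A * x + B * y + C


def is_x_major(coeffs):
    """True when the slope lies in [-1, +1], i.e. |dy| <= |dx|.
    A degenerate zero-length line is reported as x-major here."""
    A, B, _ = coeffs
    return abs(A) <= abs(B)


coeffs = extended_line_coefficients((1, 1), (4, 3))
print(coeffs, is_x_major(coeffs))
```

Both end points evaluate to zero under these coefficients, and any point off the extended line gives a non-zero value whose sign indicates the side on which it lies.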

As shown in FIG. 3, the first two tests within the hierarchy comprise first testing whether the end point is in the diamond test area (block 306) and then testing whether the start point is in the diamond test area (block 308). If the end point is in the diamond test area (‘Yes’ in block 306), then the pixel is not drawn as part of the aliased line (block 304). If the end point is not in the diamond test area (‘No’ in block 306) and the start point is in the diamond test area (‘Yes’ in block 308) then the pixel is drawn as part of the aliased line (block 302).

If, however, the end point is not in the diamond test area (‘No’ in block 306) and the start point is not in the diamond test area (‘No’ in block 308) then the method progresses to the third test (block 310). In this test, a determination is made as to whether the line crosses more than one extended diamond edge or not. The four extended diamond edges 602-608 of a pixel 100 are shown in FIG. 6. Like the extended line shown in FIG. 5 and described above, an extended diamond edge passes through the two corners at the ends of the diamond edge and extends beyond both those corners. In defining the vectors of the extended diamond edges, which may have the form ax+by+c, a convention may be used to determine whether the constant coefficients a and b are positive or negative and in the example shown in FIG. 6, the extended diamond edges are defined as follows:

| Diamond edge | a coefficient | b coefficient |
|---|---|---|
| Lower-left edge 104 | positive | negative |
| Lower-right edge 106 | negative | negative |
| Upper-right edge 112 | negative | positive |
| Upper-left edge 110 | positive | positive |

The actual values of the coefficients for a pixel may be obtained from a look-up table (LUT) or otherwise calculated, and various examples are described below. Unlike the coefficients of the extended line (which are different for each line that is rasterized), the coefficients of the extended diamond edges are fixed.
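As a rough illustration of the sign convention in the table, the sketch below computes one possible set of coefficients for the pixel at integer position (px, py). Several things here are assumptions rather than details from the description: pixel centres sit at half-integer coordinates, y increases downwards (so that the ‘lower’ edges have the larger y values), the diamond has a half-width of 0.5, and a and b have unit magnitude. The function name is hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def diamond_edge_coefficients(px, py):
    """Return {edge name: (a, b, c)} for the pixel whose centre is assumed
    to be at (px + 0.5, py + 0.5), with y increasing downwards.
    Each a*x + b*y + c is positive inside the diamond and zero on the edge."""
    cx, cy = px + HALF, py + HALF
    return {
        "lower_left": (1, -1, -cx + cy + HALF),
        "lower_right": (-1, -1, cx + cy + HALF),
        "upper_right": (-1, 1, cx - cy + HALF),
        "upper_left": (1, 1, -cx - cy + HALF),
    }


for name, (a, b, c) in diamond_edge_coefficients(2, 3).items():
    print(name, a, b, c)
```

Under these assumptions a and b are the same for every pixel and only c varies with the pixel position, which fits the observation that the extended diamond edge coefficients do not depend on the line being rasterized.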

If the line does not cross more than one extended diamond edge (‘No’ in block 310, where this may be determined as described below), then the pixel is not drawn as part of the aliased line (block 304), but if the line does cross more than one extended diamond edge (‘Yes’ in block 310), then the method progresses to the fourth and fifth tests (blocks 314, 316) which may be performed in any order. If the extended line touches the right diamond point 118 and is y-major (‘Yes’ in block 314), then the pixel is drawn as part of the aliased line (block 302). If the extended line touches the bottom diamond point 108 (‘Yes’ in block 316), then the pixel is drawn as part of the aliased line (block 302). If the extended line does not satisfy either of these tests (‘No’ in both of blocks 314, 316), then the method progresses to the sixth test. In the sixth test (block 318), it is determined whether the line is on the same side of, or on, all four of the diamond points 108, 114, 116, 118. If the line is on the same side of, or on, all four diamond points (‘Yes’ in block 318), then the pixel is not drawn as part of the aliased line (block 304), but if the line is not on the same side of, or on, all four of the diamond points 108, 114, 116, 118 (‘No’ in block 318), then the pixel is drawn as part of the aliased line (block 302).

After the method of FIG. 3, a bounding box test may be applied to those pixels which are marked as drawn (in block 302). This test filters out any pixels that are marked as drawn but which fall outside the bounding box of the line being rasterized. The bounding box of the line being rasterized is a rectangle drawn around the line, with the start and end points of the line as two diagonally opposite corners of the rectangle and the sides of the rectangle running parallel to the edges of the pixels (e.g. vertically and horizontally) and various example bounding boxes 402 are shown in FIG. 4.
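The bounding box filter could be sketched as follows. The description does not spell out exactly when a pixel counts as falling outside the bounding box, so this sketch assumes that a pixel is kept when its unit square [px, px+1] × [py, py+1] overlaps or touches the box; that overlap rule and the function names are illustrative assumptions only.

```python
def bounding_box(start, end):
    """Axis-aligned box with the start and end points as opposite corners."""
    (x0, y0), (x1, y1) = start, end
    return min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1)


def filter_to_bounding_box(drawn_pixels, start, end):
    """Keep only the pixels whose unit square overlaps the line's bounding
    box (assumed rule; touching counts as overlapping)."""
    xmin, ymin, xmax, ymax = bounding_box(start, end)
    kept = []
    for px, py in drawn_pixels:
        overlaps_x = px <= xmax and px + 1 >= xmin
        overlaps_y = py <= ymax and py + 1 >= ymin
        if overlaps_x and overlaps_y:
            kept.append((px, py))
    return kept


candidates = [(0, 0), (3, 1), (5, 1), (1, 3)]
print(filter_to_bounding_box(candidates, (0.5, 0.5), (3.5, 1.5)))
```

In hardware the same effect may be obtained with a per-pixel mask rather than a list filter; the list form is used here only to keep the sketch short.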

An example implementation of the method of FIG. 3 can be written as:

- end_edge_inside: this is 4 bits, one for each diamond edge, each bit indicating whether or not the end point is considered to be on the ‘inside’ of that edge (i.e. a point on a bottom edge would be included). This is the first test (block 306).
- start_edge_inside: this is 4 bits, one for each diamond edge, each bit indicating whether or not the start point is considered to be on the ‘inside’ of that edge (i.e. a point on a bottom edge would be included). This is the second test (block 308).
- quad_cross: this is 1 when bit count(end_edge_inside xor start_edge_inside)>1, else 0. This is the third test (block 310).
- edge_on: this is an indication as follows, using the test results of the line edge vs. diamond points:
    - =1 if the line touches the right diamond point and is y-major (this is the fourth test, block 314)
    - =1 if the line touches the bottom diamond point (this is the fifth test, block 316)
    - =0 otherwise
- left_count: this is the number of diamond points that are on or to the left of the extended line
- right_count: this is the number of diamond points that are on or to the right of the extended line
- if edge_on=0 and (right_count=4 or left_count=4) then
    - crosses=0 (this is the sixth test, block 318)

Like FIG. 3, this example does not include any bounding box test and this may, for example, be implemented as a mask at the end of the method on those pixels indicated as drawn (i.e. to remove from the set of pixels to be drawn any pixels that are considered drawn but which fall outside the bounding box).
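The example implementation above can be sketched in software for a single pixel as follows. This is an illustrative model, not the hardware design: it assumes pixel centres at half-integer coordinates, y increasing downwards, unit-magnitude diamond edge coefficients, and an extended line built as A = y0 − y1, B = x1 − x0; the function names are hypothetical. The inside criteria follow the four-edge convention described in this document (bottom edges inclusive, top edges exclusive, right point included only for y-major lines).

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def diamond_edges(px, py):
    """(a, b, c) for lower-left, lower-right, upper-right, upper-left."""
    cx, cy = px + HALF, py + HALF
    return [(1, -1, -cx + cy + HALF), (-1, -1, cx + cy + HALF),
            (-1, 1, cx - cy + HALF), (1, 1, -cx - cy + HALF)]


def edge_inside_bits(point, edges, y_major):
    """One flag per diamond edge: is the point on the 'inside' of it?"""
    x, y = point
    f = [a * x + b * y + c for a, b, c in edges]
    # Upper-right edge is inclusive only where the right point can lie.
    include_right_point = y_major and x % 1 == 0
    return [f[0] >= 0, f[1] >= 0,
            f[2] >= 0 if include_right_point else f[2] > 0,
            f[3] > 0]


def pixel_drawn(px, py, start, end):
    (x0, y0), (x1, y1) = start, end
    A, B, C = y0 - y1, x1 - x0, x0 * y1 - x1 * y0
    y_major = abs(A) > abs(B)  # slope outside the range [-1, +1]
    edges = diamond_edges(px, py)
    end_edge_inside = edge_inside_bits(end, edges, y_major)
    start_edge_inside = edge_inside_bits(start, edges, y_major)
    if all(end_edge_inside):      # first test (block 306)
        return False
    if all(start_edge_inside):    # second test (block 308)
        return True
    differing = sum(s != e for s, e in zip(start_edge_inside, end_edge_inside))
    quad_cross = differing > 1    # third test (block 310)
    if not quad_cross:
        return False
    cx, cy = px + HALF, py + HALF
    points = {"top": (cx, cy - HALF), "left": (cx - HALF, cy),
              "bottom": (cx, cy + HALF), "right": (cx + HALF, cy)}
    line_at = {n: A * x + B * y + C for n, (x, y) in points.items()}
    # Fourth and fifth tests (blocks 314, 316).
    edge_on = (y_major and line_at["right"] == 0) or line_at["bottom"] == 0
    if edge_on:
        return True
    left_count = sum(v <= 0 for v in line_at.values())
    right_count = sum(v >= 0 for v in line_at.values())
    # Sixth test (block 318): all four points on one side means not drawn.
    return not (left_count == 4 or right_count == 4)


start, end = (Fraction(1, 2), Fraction(1, 2)), (Fraction(7, 2), Fraction(1, 2))
print([pixel_drawn(px, 0, start, end) for px in range(4)])
```

For the horizontal line used in the usage lines, which runs between the centres of pixels (0, 0) and (3, 0), the sketch marks the first three pixels as drawn and leaves out the pixel containing the end point, in line with the diamond exit rule.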

It will be appreciated that the order of tests shown in FIG. 3 and in the example implementation above may be varied without changing the overall result (i.e. as to whether a pixel is drawn or not drawn). For example, the check of whether a line crosses more than one extended diamond edge (in block 310) may alternatively be positioned after the ‘No’ in the sixth test (block 318).

In various examples, the method of FIG. 3 may be implemented substantially in parallel for a group of pixels, e.g. for all the pixels in a microtile.

Many of the tests in the method of FIG. 3 involve performing an edge test, i.e. testing whether a sample position (which may be the start or end point of the line or a corner of the diamond test area) is to the left of, to the right of, or on a particular edge (which may be the extended line or an extended diamond edge). Given an edge defined by a vector of the form:

f(x,y)=αx+βy+γ

where α, β and γ are constant coefficients specific to the particular edge (e.g. α=A, β=B and γ=C for the extended line and α=a, β=b and γ=c for an extended diamond edge) then the edge test may be performed by calculating the value, or the sign, of f(x, y) for the edge (block 402). This is because:

- If f(x, y) is calculated to be positive (i.e. greater than zero), then the sample position is to the right of the edge
- If f(x, y) is calculated to be negative (i.e. less than zero), then the sample position is to the left of the edge
- If f(x, y) is calculated to be exactly zero, then the sample position is precisely on the edge
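The three cases above translate directly into a small classification routine. The sketch below is illustrative only; the function name and the string results are assumptions, and a hardware implementation would typically inspect only the sign of the result rather than return a label.

```python
def edge_test(alpha, beta, gamma, x, y):
    """Classify the sample position (x, y) against the edge
    f(x, y) = alpha*x + beta*y + gamma using the sign of f."""
    f = alpha * x + beta * y + gamma
    if f > 0:
        return "right"
    if f < 0:
        return "left"
    return "on"


# Example edge f(x, y) = x - 2, i.e. the vertical line x = 2.
print(edge_test(1, 0, -2, 3, 0), edge_test(1, 0, -2, 1, 0), edge_test(1, 0, -2, 2, 5))
```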

In particular, the first test (block 306) comprises setting the sample position to be the end point (i.e. setting the values of x and y to be the coordinates of the end point) and performing edge tests for each of the extended diamond edges (i.e. using values of α, β and γ for each of the extended diamond edges). Four edge test results are therefore calculated for the end point. Using the convention for the extended diamond edges shown in FIG. 6 and described above, the end point of the line is considered to be in the diamond test area (‘Yes’ in block 306) if the value of each of the edge test results for the end point (e.g. each calculated value of f(x, y) with x,y values corresponding to the end point) for each of the extended diamond edges meets the following criteria:

| Extended diamond edge | Criteria to be considered in the test area |
|---|---|
| Lower-left edge 104 | f(x, y) is positive or zero |
| Lower-right edge 106 | f(x, y) is positive or zero |
| Upper-right edge 112 | If the fractional part of the x-coordinate of the end point is zero and the line is not x-major: f(x, y) is positive or zero. Else: f(x, y) is positive |
| Upper-left edge 110 | f(x, y) is positive |

Similarly, the second test (block 308) comprises setting the sample position to be the start point (i.e. setting the values of x and y to be the coordinates of the start point) and performing edge tests for each of the extended diamond edges (i.e. using values of α, β and γ for each of the extended diamond edges, which are the same as in the first test). Four edge test results are therefore calculated for the start point. Using the convention for the extended diamond edges shown in FIG. 6 and described above, the start point of the line is considered to be in the diamond test area (‘Yes’ in block 308) if the value of each of the edge test results for the start point (e.g. each calculated value of f(x, y) with x,y values corresponding to the start point) for each of the extended diamond edges meets the criteria set out in the table above.
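By way of illustration only, the criteria in the table above may be sketched as a single predicate applied to the four edge test results for a point (either the start point or the end point). The function name `in_diamond` and its parameter names are hypothetical; the four results are assumed to have been calculated already for the lower-left, lower-right, upper-right and upper-left extended diamond edges.

```python
def in_diamond(f_ll, f_lr, f_ur, f_ul, x_frac_is_zero, x_major):
    """Apply the per-edge criteria to the four edge test results of one point.

    f_ll, f_lr, f_ur, f_ul: values of f(x, y) for the lower-left, lower-right,
    upper-right and upper-left extended diamond edges respectively.
    x_frac_is_zero: True if the fractional part of the point's x-coordinate is zero.
    x_major: True if the line being rasterised is x-major.
    """
    # Upper-right edge: a zero result is only accepted in the special case
    # that identifies a point lying exactly on the right diamond point.
    if x_frac_is_zero and not x_major:
        ur_ok = f_ur >= 0
    else:
        ur_ok = f_ur > 0
    # Lower edges accept zero; the upper-left edge requires a strictly positive result.
    return f_ll >= 0 and f_lr >= 0 and ur_ok and f_ul > 0
```

The same predicate would be evaluated twice per pixel, once with the end point results (first test) and once with the start point results (second test).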

In the table of criteria given above (and used in the first and second tests), the condition relating to the upper-right edge (i.e. the determination of whether the fractional part of the x-coordinate of the end point is zero and the line is not x-major) checks whether the start/end point is exactly on the right diamond point. In practice, any test that is looking to see if the value of f(x, y) is positive or zero can be implemented by modifying the test by adding or subtracting one LSB in the final sum/comparison and then still testing to see if the result is positive, which requires a smaller area of hardware than testing to see if the result is exactly zero. In particular, instead of evaluating:

αx+βy+γ>0

the evaluation is:

αx+βy+γ+(one LSB)>0

This means that if αx+βy+γ≥0, the test will return a positive value, whereas it otherwise would not. By adding only a single LSB, only the sign of the exact equivalent case (where αx+βy+γ=0) is changed.
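By way of illustration only, the effect of adding one LSB can be sketched using integer (fixed-point) values in which each integer counts LSBs. The function names are hypothetical and the fixed-point representation is an assumption made for the sketch.

```python
def is_positive(value_in_lsbs):
    """The cheap comparison: is the result strictly greater than zero?"""
    return value_in_lsbs > 0


def is_positive_or_zero(value_in_lsbs):
    """Implement 'f >= 0' using only the '> 0' comparison.

    One LSB is added to the sum before the comparison, so only the case
    where the sum is exactly zero changes its outcome.
    """
    return is_positive(value_in_lsbs + 1)
```

For every integer input the result of `is_positive_or_zero` agrees with a direct `>= 0` comparison, so no separate equality check is needed.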

The addition of a single LSB also works when evaluating:

αx+βy+γ+(one LSB)<0

In this evaluation, the addition of a single LSB turns a result that would be false, because αx+βy+γ=0 (i.e. a sample point exactly on an edge) into a ‘true’ result. This can be used to redefine whether or not a sample point that lies exactly on an edge is on the inside or outside of the test area.

The third test (block 310) reuses the edge test results (i.e. the signs or values of f(x, y)) from the first and second tests (blocks 306-308). This is because a line crosses an extended diamond edge if the start and end points have edge test results (e.g. values of f(x, y)) which are of opposite sign, i.e. the line crosses an extended diamond edge if the start point is to the left of the extended diamond edge (edge test is negative) and the end point is to the right of the extended diamond edge (edge test is positive), or if the start point is to the right of the extended diamond edge (edge test is positive) and the end point is to the left of the extended diamond edge (edge test is negative).
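By way of illustration only, the third test may be sketched as a count of the extended diamond edges for which the start and end point results have opposite signs. The function names are hypothetical, and the sketch treats a result of exactly zero as not being of opposite sign to anything, which is a simplifying assumption (the handling of points exactly on an edge follows the criteria given in the table above).

```python
def crosses(f_start, f_end):
    """True if the start and end results for one edge have opposite signs."""
    return (f_start < 0 < f_end) or (f_end < 0 < f_start)


def crosses_more_than_one_edge(start_results, end_results):
    """Third test: does the line cross more than one extended diamond edge?

    start_results / end_results: the four edge test results for the start
    and end points, listed in the same edge order.
    """
    crossings = sum(crosses(s, e) for s, e in zip(start_results, end_results))
    return crossings > 1
```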

The fourth test (block 314) does not reuse edge test results from the first or second tests, but instead comprises setting the sample position to be the right diamond point 118 (i.e. setting the values of x and y to be the coordinates of the right diamond point 118) and performing an edge test for the extended line (i.e. using values of α, β and γ for the extended line). The extended line touches the right diamond point 118 if the edge test result is exactly zero and the extended line is y-major.

The fifth test (block 316) comprises setting the sample position to be the bottom diamond point 108 (i.e. setting the values of x and y to be the coordinates of the bottom diamond point 108) and performing an edge test for the extended line (i.e. using values of α, β and γ for the extended line). The extended line touches the bottom diamond point 108 if the edge test result is exactly zero.

The sixth test (block 318) reuses the edge test results from the fourth and fifth tests, in order to provide the edge test for the extended line in relation to the right diamond point and the bottom diamond point. In addition, the sixth test comprises two further edge test calculations in relation to the extended line (i.e. using values of α, β and γ for the extended line) with the sample positions (i.e. the x and y values) in these being set to the two remaining diamond points, i.e. the left diamond point 116 and the top diamond point 114. Once the four edge test results are obtained, the sixth test determines whether the extended line is on the same side of or on all the diamond points, i.e. the signs of each of the edge test results are the same.
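By way of illustration only, the fourth, fifth and sixth tests may be sketched as three predicates that share the four edge test results for the extended line. Several details are assumptions made for the sketch and are not taken from the description: the pixel is 1×1 with its origin at its top left corner and y increasing downwards, the line coefficients are derived from the end points as shown, a line is taken to be y-major when |Δy| > |Δx|, and the sixth test is implemented as a strict same-sign check. All names are hypothetical.

```python
def line_coefficients(x0, y0, x1, y1):
    """One possible choice of A, B, C such that A*x + B*y + C = 0 on the line."""
    a = y1 - y0
    b = x0 - x1
    c = -(a * x0 + b * y0)
    return a, b, c


def diamond_points(px, py):
    """Diamond points of the 1x1 pixel whose top left corner is (px, py)."""
    return {
        "left": (px, py + 0.5),
        "top": (px + 0.5, py),
        "right": (px + 1.0, py + 0.5),
        "bottom": (px + 0.5, py + 1.0),
    }


def late_tests(x0, y0, x1, y1, px, py):
    """Return (touches_right, touches_bottom, same_side) for one pixel."""
    a, b, c = line_coefficients(x0, y0, x1, y1)
    # Four edge tests for the extended line, one per diamond point.
    f = {name: a * x + b * y + c for name, (x, y) in diamond_points(px, py).items()}
    y_major = abs(y1 - y0) > abs(x1 - x0)              # assumed definition
    touches_right = f["right"] == 0 and y_major        # fourth test
    touches_bottom = f["bottom"] == 0                  # fifth test
    values = list(f.values())
    same_side = all(v > 0 for v in values) or all(v < 0 for v in values)  # sixth test
    return touches_right, touches_bottom, same_side
```

For example, under these assumptions a vertical line along x=1 gives `(True, False, False)` for the pixel at the origin, because the extended line passes exactly through that pixel's right diamond point.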

FIG. 7 is a schematic diagram of example line rasterization hardware that implements the method of FIG. 3 for a single pixel and shows the reuse of results between tests. In the example of FIG. 7, there are five edge test hardware logic blocks 702-710 that perform edge test calculations for each of the extended diamond edges of a diamond test area within a pixel and for the extended line, i.e. which calculate f(x, y). The start and end points of the line are input as sample positions to each of the four edge test hardware logic blocks corresponding to the extended diamond edges 702-708 along with the coefficients a, b and c for the extended diamond edges, resulting in four edge test results for each sample position. The four edge test results for the end point are used in the first test (logic block T1) and the four edge test results for the start point are used in the second test (logic block T2). The logic blocks that implement the first and second tests (logic blocks T1, T2) may, for example, comprise logic that implements an AND logic function because only if all four inputs (one corresponding to each extended diamond edge) are positive (e.g. logic one) is the test satisfied.

Pairs of the edge test results that relate to the same extended diamond edge are combined and used in the third test (logic block T3). As shown in FIG. 7, pairs of edge results (with each pair comprising an edge result for each of the start and end points for the same extended diamond edge) may be combined using logic 712 that implements an XOR logic function to determine if the two results have opposite signs (with the ‘inside’ test being modified as set out in the criteria in the table above). The logic block that implements the third test (logic block T3) then determines whether more than one of the pairs comprises results with opposite signs by combining the outputs from the XOR logic 712 and determining if two or more are logic ones.

The coordinates of each of the diamond points are input as sample positions to the edge test hardware logic block corresponding to the extended line 710, resulting in four edge test results per pixel, one for each sample position (i.e. one for each diamond point). Two of the edge test results are used in one of the fourth and fifth tests (logic blocks T4 and T5) and all four edge test results are used in the sixth test (logic block T6).

Whilst the arrangement shown in FIG. 7 comprises five separate edge test hardware logic blocks, each corresponding to a different edge, it will be appreciated that in other examples, edge test hardware logic may be used for more than one edge and the edge coefficients (e.g. the values of α, β and γ) may be input to the edge test hardware logic along with the coordinates (e.g. the x, y values) of the sample positions that are being compared to the edge by the edge test hardware logic. Similarly, in various examples, the edge test hardware logic blocks may be duplicated where they are used to compute edge values for multiple sample points, so that multiple edge test results (for different sample positions) can be calculated in parallel.

Furthermore, although there are four extended diamond edges for a pixel (e.g. as shown in FIG. 6), as the edges comprise two pairs of parallel edges, it is not necessary to separately perform edge tests for all four extended diamond edges. Instead, by inverting the edge test result for one of a pair of parallel edges (e.g. the edge test result for the lower-right extended diamond edge 106) and adding a constant, the edge test result for the other of the pair of parallel edges (e.g. the edge test result for the upper-left extended diamond edge 110) can be obtained. This therefore halves the number of edge tests that need to be performed in relation to the extended diamond edges (e.g. in the first, second and third tests in FIG. 3, blocks 306-310) for a single pixel. The result is inverted because the extended diamond edges are vectors in opposite directions (as shown in FIG. 6 and by the signs of the coefficients in the table above) and the constant that is added shifts the result across from one side of the diamond test area to the other. In various examples, where the size of a pixel is 1×1, the values of a and b may be set to ±0.5, the constant that is added may be set to +0.5 to move from the lower-right diamond edge 106 to the upper-left diamond edge 110, or set to −0.5 to move from the upper-right diamond edge 112 to the lower-left diamond edge 104. Alternatively, the values of a, b and c may all be scaled without changing the pixel size. Consequently, the hardware arrangement for a single pixel as shown in FIG. 7 may be modified such that there are only two hardware logic units that perform edge tests on extended diamond edges and the hardware logic that performs the first, second and third tests (T1-T3) may include hardware logic to generate the results for the other two extended diamond edges by modifying the result of the sum of products (SOP), f(x, y), before taking the sign (e.g. by inverting results and adding the appropriate constant, i.e. +0.5 or −0.5).
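By way of illustration only, the derivation of one edge test result from its parallel partner may be sketched for the lower-right and upper-left edges of a 1×1 pixel at the origin. The specific coefficient values below (including the constants 0.75 and −0.25) are assumptions chosen so that a=b=±0.5 and so that both edges are positive inside the diamond with y increasing downwards; the function names are hypothetical.

```python
def f_lower_right(x, y):
    """Assumed edge function for the lower-right edge (through (1, 0.5) and (0.5, 1))."""
    return -0.5 * x - 0.5 * y + 0.75


def f_upper_left_direct(x, y):
    """Assumed edge function for the upper-left edge (through (0, 0.5) and (0.5, 0))."""
    return 0.5 * x + 0.5 * y - 0.25


def f_upper_left_derived(x, y):
    """Upper-left result obtained by inverting the lower-right result and adding +0.5."""
    return -f_lower_right(x, y) + 0.5
```

Under these assumptions the derived and directly calculated results are identical at every sample position, so only one sum of products per pair of parallel edges needs to be evaluated.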

As well as reusing results for different tests for the same pixel, as described above, where the method of FIG. 3 is implemented substantially in parallel for a group of pixels, e.g. for all the pixels in a microtile, there may be reuse of results between pixels within the group. FIG. 8 is a schematic diagram showing a group of 4×4 pixels, which may in various examples comprise a microtile, and the diamond test areas for each of the 16 pixels. Whilst each diamond test area has four corners, for the 4×4 grid of pixels there are only 40 different corner positions (rather than 16×4=64). These 40 different corner positions may be considered to be two samples per pixel, e.g. the top diamond point 114 and the left diamond point 116 of each of the 16 pixels in the 4×4 grid, plus 8 extra samples: the bottom diamond point 108 of the bottom four pixels in the grid and the right diamond point 118 of the right-most four pixels in the grid. This reuse of results therefore reduces the number of computations that need to be performed (e.g. in hardware logic) in order to implement the fourth, fifth and sixth tests of the method shown in FIG. 3 (blocks 314-318), i.e. the reuse of results reduces the number of edge tests that need to be performed to compare the diamond points of each pixel to the extended line.
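By way of illustration only, the count of 40 distinct corner positions can be checked by enumerating the diamond points of every pixel in the grid and discarding duplicates. The function name is hypothetical and coordinates are expressed in half-pixel units (an assumption made so that all values are integers).

```python
def distinct_diamond_points(n=4):
    """Return the set of distinct diamond points for an n x n grid of pixels.

    Coordinates are in half-pixel units, so the pixel at (row, col) spans
    x in [2*col, 2*col + 2] and y in [2*row, 2*row + 2].
    """
    points = set()
    for row in range(n):
        for col in range(n):
            points.add((2 * col, 2 * row + 1))      # left diamond point
            points.add((2 * col + 1, 2 * row))      # top diamond point
            points.add((2 * col + 2, 2 * row + 1))  # right diamond point
            points.add((2 * col + 1, 2 * row + 2))  # bottom diamond point
    return points
```

For a 4×4 grid this yields 40 points, i.e. 2 per pixel (32) plus the 8 extra points along the bottom and right of the grid.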

FIG. 8 also shows the extended diamond edges, a subset of which have been labelled A-N. It can be seen from FIG. 8 that whilst each diamond test area has four extended edges, adjacent pixels share extended edges, although in order to maintain the direction convention shown in FIG. 6 and described above, it is necessary to invert and adjust the result of the SOP, f(x, y), before taking the sign. So, in addition to only calculating two edge test results for extended diamond edges for a single pixel, when considering a group of pixels, the results can be further reused by performing the relevant inversions and adding a multiple of ±0.5 to shift the edge across the grid of pixels. This means that it is not necessary to perform 64 independent edge tests, as would otherwise be needed (i.e. 2 edge tests per pixel for each of the start and end points, 2×16×2=64). Instead, a plurality of adders may be provided to evaluate the 2×14×2 edges (2 sample positions, i.e. the start and the end points, 14 edges A-N, 2 directions) substantially in parallel by reusing results between edges.

In various examples, only four independent edge tests may be performed in relation to the extended diamond edges, with these four tests corresponding to tests for each of two edges for the start point and the end point of the line and then one or more look-up tables (LUTs) may be used to determine, for any extended diamond edge for any of the pixels in the group being considered (e.g. in a 4×4 pixel microtile) which edge test result (of the two for the particular sample position) to select, whether to invert the result of that selected edge test and the value of the constant to add (where this constant may be zero in some cases). In other examples, instead of using LUTs to determine whether to invert the result and/or identify the value of the constant, this may be calculated in hardware logic.

In various examples, two edge test hardware logic blocks may be used to calculate edge test results for two perpendicular extended edges 901, 903 passing through the origin, as shown in FIG. 9, and x,y values corresponding to the start point and end point (x_(start),y_(start) and x_(end),y_(end)) of the line that is being rasterized. In such examples, the two vectors have the form:

f₉₀₁(x,y)=−0.5x+0.5y

f₉₀₃(x,y)=−0.5x−0.5y

And this generates four results: f₉₀₁(x_(start),y_(start)), f₉₀₃(x_(start),y_(start)), f₉₀₁(x_(end),y_(end)), f₉₀₃(x_(end),y_(end)). The edge test results for the extended diamond edges in a pixel are then given by:

f(x,y)=i(F+g+h)=iF+ig+ih

Where F is one of the four calculated edge test results (i.e. one of f₉₀₁(x_(start),y_(start)), f₉₀₃(x_(start),y_(start)), f₉₀₁(x_(end),y_(end)) and f₉₀₃(x_(end),y_(end)) from above) and the coefficient i, where i=±1, determines whether the edge test result is inverted or not. The g coefficient shifts the result by a quarter of a pixel, g=±0.25, to move the edge away from the origin so that it passes through a diamond point and the h coefficient is a multiple of 0.5 and steps the edge across the grid of pixels. The values of i, F, g and h are fixed per specific diamond edge and may, for example, be determined from one or more LUTs and examples are provided below.

The LUT below may be used to determine the value of i for each diamond edge of each pixel in the 4×4 grid of pixels 800 in FIG. 8 and provides an index (e.g. in terms of the letter of the edge, in the example shown, using the labelling scheme shown in FIG. 8) to the second LUT that provides the values of g and h and also identifies which of the four calculated edge test results are used, i.e. identifies the value of F. Each cell in the LUT below contains four values, arranged as follows:

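| Upper-left edge | Upper-right edge |
|---|---|
| Lower-left edge | Lower-right edge |

| Pixel row | Edges | Column 1 | Column 2 | Column 3 | Column 4 |
|---|---|---|---|---|---|
| 1 | Upper-left, upper-right | −A +K | −B +J | −C +I | −D +H |
| 1 | Lower-left, lower-right | −K +A | −J +B | −I +C | −H +D |
| 2 | Upper-left, upper-right | −B +L | −C +K | −D +J | −E +I |
| 2 | Lower-left, lower-right | −L +B | −K +C | −J +D | −I +E |
| 3 | Upper-left, upper-right | −C +M | −D +L | −E +K | −F +J |
| 3 | Lower-left, lower-right | −M +C | −L +D | −K +E | −J +F |
| 4 | Upper-left, upper-right | −D +N | −E +M | −F +L | −G +K |
| 4 | Lower-left, lower-right | −N +D | −M +E | −L +F | −K +G |

Where a value −X corresponds to i=−1 and an index of X, and a value of +X corresponds to i=+1 and an index of X, e.g. the value −A identifies that i=−1 and the index is A.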

As shown in the LUT above, the value of i is always −1 for the upper-left and lower-left diamond edges and +1 for the upper-right and lower-right diamond edges in examples where the two edge tests that are calculated are the two downwards edges 901, 903, as shown in FIG. 9. The values of i will be different if two different edges are calculated. For edges with the label A-G (i.e. all upper-left and lower-right edges), the value of F is f₉₀₃(x, y) and for edges with the label H-N (i.e. all upper-right and lower-left edges) the value of F is f₉₀₁(x, y). In various examples, a single bit may be stored in the LUT above for each edge A-N (in addition to the index) that indicates the direction of the edge (i.e. parallel to edge 901 or edge 903 in FIG. 9), with a one indicating one direction (e.g. such that the line is parallel to edge 901 and F=f₉₀₁(x, y)) and a zero indicating the other direction (e.g. such that the line is parallel to edge 903 and F=f₉₀₃(x, y)). For example:

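| Pixel row | Edges | Column 1 | Column 2 | Column 3 | Column 4 |
|---|---|---|---|---|---|
| 1 | Upper-left, upper-right | −A,0 +K,1 | −B,0 +J,1 | −C,0 +I,1 | −D,0 +H,1 |
| 1 | Lower-left, lower-right | −K,1 +A,0 | −J,1 +B,0 | −I,1 +C,0 | −H,1 +D,0 |
| 2 | Upper-left, upper-right | −B,0 +L,1 | −C,0 +K,1 | −D,0 +J,1 | −E,0 +I,1 |
| 2 | Lower-left, lower-right | −L,1 +B,0 | −K,1 +C,0 | −J,1 +D,0 | −I,1 +E,0 |
| 3 | Upper-left, upper-right | −C,0 +M,1 | −D,0 +L,1 | −E,0 +K,1 | −F,0 +J,1 |
| 3 | Lower-left, lower-right | −M,1 +C,0 | −L,1 +D,0 | −K,1 +E,0 | −J,1 +F,0 |
| 4 | Upper-left, upper-right | −D,0 +N,1 | −E,0 +M,1 | −F,0 +L,1 | −G,0 +K,1 |
| 4 | Lower-left, lower-right | −N,1 +D,0 | −M,1 +E,0 | −L,1 +F,0 | −K,1 +G,0 |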

The LUT below shows the values of g and h for the edges A-N where i=+1 and the values of −g and −h where i=−1. This means that the hardware logic that calculates f(x, y) does not need to perform a separate negation operation for these constants as part of the calculation, but negation of the selected value of F is still required where i=−1. Alternatively, the LUT may store +g and +h values where i=−1 in the same way as it stores +g and +h values where i=+1.

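| Edge | Edge direction (where i = +1) | g | h | Edge direction (where i = −1) | −g | −h |
|---|---|---|---|---|---|---|
| A | Down to left | 0.25 | +0.5 | Up to right | 0.25 | −0.5 |
| B | Down to left | 0.25 | +1.0 | Up to right | 0.25 | −1.0 |
| C | Down to left | 0.25 | +1.5 | Up to right | 0.25 | −1.5 |
| D | Down to left | 0.25 | +2.0 | Up to right | 0.25 | −2.0 |
| E | Down to left | 0.25 | +2.5 | Up to right | 0.25 | −2.5 |
| F | Down to left | 0.25 | +3.0 | Up to right | 0.25 | −3.0 |
| G | Down to left | 0.25 | +3.5 | Up to right | 0.25 | −3.5 |
| H | Down to right | 0.25 | +1.5 | Up to left | 0.25 | −1.5 |
| I | Down to right | 0.25 | +1.0 | Up to left | 0.25 | −1.0 |
| J | Down to right | 0.25 | +0.5 | Up to left | 0.25 | −0.5 |
| K | Down to right | 0.25 | 0.0 | Up to left | 0.25 | 0.0 |
| L | Down to right | 0.25 | −0.5 | Up to left | 0.25 | +0.5 |
| M | Down to right | 0.25 | −1.0 | Up to left | 0.25 | +1.0 |
| N | Down to right | 0.25 | −1.5 | Up to left | 0.25 | +1.5 |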

In the example LUT above, the values of g and h are specified separately; however, in various examples, there may instead be a single parameter which is g+h (or −g−h where i=−1). In other examples, the value of g (and −g) may not be stored in a LUT but instead a constant offset (of 0.25) may be applied to all edge test results irrespective of the value of i.

It will be appreciated that the example LUTs provided above may be used where the two edge tests that are calculated are the two downwards edges 901, 903, as shown in FIG. 9. The values of g and h will be different if two different edges are calculated.
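By way of illustration only, the use of the two base results together with i, g and h can be sketched as follows. The sketch makes several assumptions that are not stated in the description: pixels are 1×1, pixel (row, col) has its top left corner at (col, row) with y increasing downwards, and the edge labels follow the regular pattern of the first LUT (index row+col within A-G and K+row−col within H-N). The constant offset is taken to be 0.25 for every edge, as in the second LUT. All function and variable names are hypothetical.

```python
EDGES = "ABCDEFGHIJKLMN"

# Values of h where i = +1 (the value used where i = -1 is the negation).
H_PLUS = {"A": 0.5, "B": 1.0, "C": 1.5, "D": 2.0, "E": 2.5, "F": 3.0, "G": 3.5,
          "H": 1.5, "I": 1.0, "J": 0.5, "K": 0.0, "L": -0.5, "M": -1.0, "N": -1.5}


def f901(x, y):
    return -0.5 * x + 0.5 * y


def f903(x, y):
    return -0.5 * x - 0.5 * y


def edge_labels(row, col):
    """(i, label) for each diamond edge of the pixel at (row, col), rows and columns from 0."""
    diag = EDGES[row + col]        # one of A-G
    anti = EDGES[10 + row - col]   # one of H-N (K has index 10)
    return {"upper_left": (-1, diag), "upper_right": (+1, anti),
            "lower_left": (-1, anti), "lower_right": (+1, diag)}


def edge_result(i, label, x, y):
    """f(x, y) = i*F + (constant 0.25) + i*h for one extended diamond edge."""
    base = f903 if label in "ABCDEFG" else f901
    return i * base(x, y) + 0.25 + i * H_PLUS[label]


def pixel_edge_results(row, col, x, y):
    """All four extended diamond edge results for one pixel at sample (x, y)."""
    return {name: edge_result(i, label, x, y)
            for name, (i, label) in edge_labels(row, col).items()}
```

Under these assumptions every pixel centre evaluates to 0.25 on all four of its extended diamond edges and each left diamond point evaluates to exactly zero on the upper-left and lower-left edges, which is a convenient sanity check on the constants.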

FIG. 10 is a schematic diagram of example line rasterization hardware that implements the method of FIG. 3 for a group of pixels and shows the reuse of results between tests. In the example of FIG. 10, the rasterization hardware comprises the test hardware logic units T1-T6 (as described above with reference to FIG. 7) and a first sum of products (SOPs) hardware logic unit 1002 that performs the edge test for the extended line for a subset of all the diamond points of the group of pixels (as described above). This first SOP hardware logic unit 1002 therefore calculates f(x, y)=αx+βy+γ using the coefficients (i.e. values of α, β and γ) for the extended line (e.g. A, B and C) which may be received as inputs by the first SOP hardware logic unit 1002. Coordinates of the diamond points (e.g. two points per pixel plus the additional points, as detailed above) may also be input or may be obtained from a LUT as these are fixed points. The results from the first SOP hardware logic unit 1002 are input to the test hardware logic units T4-T6 that perform the fourth, fifth and sixth tests of the method of FIG. 3 (blocks 314-318) as described above.

The rasterization hardware further comprises two second sum of products (SOPs) hardware logic units 1004. One of these units 1004 calculates f₉₀₁(x_(start),y_(start)) and f₉₀₁(x_(end),y_(end)) and the other calculates f₉₀₃(x_(start),y_(start)) and f₉₀₃(x_(end),y_(end)) using the coordinates of the start and end points of the line that is being rasterized and which are received as inputs by the second SOP hardware logic units 1004. The coefficients in the SOP (e.g. ±0.5) are fixed as detailed above. The rasterization hardware further comprises memory storing one or more LUTs 1006 (as detailed above) that store the different constants for the different extended diamond edges for the group of pixels (e.g. values of i, g and h) and an addition and comparison hardware logic unit 1008 that sums an output from a second SOP hardware logic unit 1004 and one or more constants from the LUTs 1006, including performing any necessary negation, to generate the edge test results, f(x, y), for each extended diamond edge of each pixel. The results from the addition and comparison hardware logic unit 1008 are input to the test hardware logic units T1-T3 that perform the first, second and third tests of the method of FIG. 3 (blocks 306-310).

As noted above, instead of using LUTs to determine the values of one or more of i, g and h, these may be calculated in hardware logic or otherwise determined (e.g. the value of i may be implicit in that it is a consequence of the way the inputs, h or ih, are fed into the addition and comparison unit 1008 and hence is essentially set, and fixed, within the hardware). In such examples, one or more of the LUTs 1006 in FIG. 10 may be replaced by one or more coefficient calculation hardware units. In an example, instead of an LUT that provides the values of h and −h, a third SOP hardware logic unit may be provided that calculates Ax+By and receives as inputs A=2, B=−0.5, a set of values of x which is {0, 1, 2} (which may also be written as [0,2]) and a set of values of y which is {0, 1, 2, 3}. This enables the third SOP hardware logic unit to calculate all the necessary values of ih as shown below:

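| Edge | x | y | ih |
|---|---|---|---|
| A | 1 | 3 | 0.5 |
| B | 1 | 2 | 1 |
| C | 1 | 1 | 1.5 |
| D | 1 | 0 | 2 |
| E | 2 | 3 | 2.5 |
| F | 2 | 2 | 3 |
| G | 2 | 1 | 3.5 |
| H | 1 | 1 | 1.5 |
| I | 1 | 2 | 1 |
| J | 1 | 3 | 0.5 |
| K | 0 | 0 | 0 |
| L | 0 | 1 | −0.5 |
| M | 0 | 2 | −1.0 |
| N | 0 | 3 | −1.5 |

It will be appreciated that other combinations of values of A and B and/or sets of input values x,y may alternatively be used to calculate the values of ih.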
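By way of illustration only, the calculation performed by the third SOP hardware logic unit can be sketched as below, using A=2 and B=−0.5 and the x,y input pairs listed for edges A-N. The names `SOP_INPUTS` and `third_sop` are hypothetical.

```python
# (x, y) input pair for each edge label, as listed for the third SOP unit.
SOP_INPUTS = {
    "A": (1, 3), "B": (1, 2), "C": (1, 1), "D": (1, 0),
    "E": (2, 3), "F": (2, 2), "G": (2, 1),
    "H": (1, 1), "I": (1, 2), "J": (1, 3),
    "K": (0, 0), "L": (0, 1), "M": (0, 2), "N": (0, 3),
}


def third_sop(x, y, a=2.0, b=-0.5):
    """Evaluate A*x + B*y, which yields the step constant for one edge."""
    return a * x + b * y


# Step constants for all fourteen labelled edges.
STEP = {label: third_sop(x, y) for label, (x, y) in SOP_INPUTS.items()}
```

Evaluating the sum of products for each pair reproduces the fourteen step constants, e.g. 2×1−0.5×3=0.5 for edge A and 2×0−0.5×3=−1.5 for edge N, so a small multiplier-free unit could replace the stored values.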

The line rasterization hardware shown in FIGS. 7 and 10 above may be implemented in the rasterization phase 204 of FIG. 2 and in particular may correspond to the line rasterization hardware 211 in FIG. 2. Whilst FIG. 2 shows separate line rasterization hardware 211 and conservative rasterization hardware 212, in various examples, the line rasterization method of FIG. 3 may be implemented on the hardware that is also used to perform conservative rasterization or other aspects of rasterization.

FIG. 11A is a schematic diagram of an example hardware arrangement that may be used to perform conservative rasterization or normal rasterization. It comprises three instances of edge test hardware 1102-1104, each of which performs calculations in relation to a different edge of a triangular primitive, or where the primitive is a parallelogram, edge test hardware B 1103 is configured to perform calculations in relation to two parallel edges of the parallelogram. As shown in FIG. 11A, each instance of edge test hardware 1102-1104 receives as inputs the coefficients of the corresponding primitive edge (A0,B0,C0 for the first edge, A1,B1,C1 for the second edge, A2,B2,C2 for the third edge and for parallelograms, A0,B0,C2 for the third edge and A1,B1,C3 for the fourth edge). As shown in FIG. 11B, this same hardware arrangement may be used to perform conservative rasterization, normal rasterization and the line rasterization method of FIG. 3. Where it is additionally used to perform line rasterization, edge test hardware B 1103 (i.e. the edge test hardware that is arranged to handle two parallel edges for conservative rasterization) is used to perform the first, second and third tests (blocks 306-310) and hence corresponds to the upper portions 71, 11 of the hardware arrangements shown in FIGS. 7 and 10 and described above. Consequently, this instance of the edge test hardware receives as inputs the x,y coordinates of the start and end of the line that is being rasterized. One of the other instances of edge test hardware, edge test hardware A 1102, is used to perform the fourth, fifth and sixth tests (blocks 314-318) and hence corresponds to the lower portions 72, 12 of the hardware arrangements shown in FIGS. 7 and 10 and described above. The third instance of edge test hardware, edge test hardware C 1104, is not used for line rasterization but is used for conservative or normal rasterization.

This edge test hardware shown in FIGS. 11A and 11B (as described in co-pending application GB1805608.5, filed 5 Apr. 2018) and its use in conservative rasterization (as described in co-pending application GB1810719.3, filed 29 Jun. 2018), relies on the regular sub-division of the rendering space, as can be described with reference to FIGS. 12A and 12B. The rendering space 1200 is divided into a plurality of tiles 1202 (which may, for example, be square or rectangular) and each tile is further divided into a regular arrangement of smaller areas 1204, referred to as ‘microtiles’. Within each tile 1202 there is a pre-defined arrangement of microtiles 1204 and in various examples, all of the microtiles 1204 are the same size. Whilst FIG. 12A shows an arrangement of 5×4 microtiles 1204 within a tile 1202, in other examples there may be a different number of microtiles 1204 in each tile 1202. Each microtile 1204 comprises the same number (and arrangement) of pixels 1206. In the example shown in FIGS. 12A and 12B, each microtile 1204 comprises a 4×4 arrangement of 16 pixels 1206 and a microtile 1204 may, for example, correspond to the group of pixels 800 shown in FIG. 8 and described above.

Given the sub-division of the tile 1202, as described above, the coordinates of sample positions within a pixel 1206 (e.g. the coordinates of pixel corners or diamond points, as defined with reference to the tile origin 1210) can be broken down into three components: x and y offsets of the microtile 1204 relative to the tile 1202, X_(UT), Y_(UT), x and y pixel positions within the microtile 1204, X_(P), Y_(P) (which are defined relative to the origin of the microtile) and x and y subsample positions within the pixel 1206, X_(S), Y_(S) (which are defined relative to the origin of the pixel), where: (X,Y)=(X_(UT)+X_(P)+X_(S), Y_(UT)+Y_(P)+Y_(S)).

The set of x and y offsets for the plurality of microtiles 1204 relative to the tile origin 1210 are the same for all tiles, because each tile is subdivided in the same way into microtiles. Similarly, the set of x and y offsets for the plurality of pixels 1206 relative to the microtile origin (which again may be defined to be the top left corner) are the same for all microtiles (in any tile). The set of x and y subsample positions within a pixel, as defined relative to the pixel origin (which again may be defined to be the top left corner), may be the same for all pixels (in any microtile and any tile) and in various examples there may only be a single subsample position per pixel.

As described in detail below and shown in FIG. 13A, the edge calculation hardware 1102-1104 may each divide up the hardware that performs the calculations into a plurality of sub-units which each calculate a part of the result:

-   -   One or more microtile component hardware elements 1302 that calculate the microtile component(s) of the function, f(x, y), being calculated by the edge calculation hardware;
    -   One or more pixel component hardware elements 1304 that calculate the pixel position component(s) of the function, f(x, y), being calculated by the edge calculation hardware; and
    -   None, one or more subsample component hardware elements 1306 that calculate the subsample position components of the function, f(x, y), being calculated by the edge calculation hardware.

A plurality of adders (e.g. in the form of an addition and comparison element 1308) may be used to combine the outputs from the hardware elements in different combinations (such that each component may be reused in a plurality of different combinations) to generate a plurality of output results for different subsamples and/or different pixels (e.g. where there is a single subsample per pixel). As described above, an edge test involves a comparison and hence each output result may, in some examples, only be the sign of the SOP value (e.g. a single bit), although the full result (comprising all the bits of the SOP value) may be output in other examples.

In addition to using adders to combine the outputs from the hardware elements, the edge calculation hardware may further comprise one or more multiplexers 1310 to select the outputs which are input to an adder (and hence gate out any outputs that are not required) and this enables the hardware to be reconfigurable and be used for more than one type of calculation (e.g. for both conservative rasterization and line rasterization). In addition, the inclusion of multiplexers to select outputs which are input to an adder enables the hardware described herein to be configured for a variable number of pixels and/or samples (e.g. to enable support for different anti-aliasing modes).

The particular component results, as generated by the separate hardware components (i.e. the microtile component hardware elements, the pixel component hardware elements and the subsample component hardware elements) are re-used for multiple output results. This leads to a reduction in the hardware size (e.g. area) and power consumption (e.g. compared to computing each full SOP independently) and enables multiple results to be generated in parallel. Additionally, by structuring the hardware as described herein, it scales well, i.e. it can be easily extended to more modes and more output samples.

The example hardware arrangement 1300, shown in FIG. 13A, comprises two microtile component hardware elements 1302, a plurality of pixel component hardware elements 1304, two subsample component hardware elements 1306 and a plurality of addition and comparison elements (which may, for example, be implemented as a plurality of adders) 1308, with each addition and comparison element 1308 generating a separate output result (i.e. from a different combination of inputs from the various inputs, e.g. for a different sample position within the same microtile). Where the hardware arrangement 1300 is used for conservative rasterization there may be at least 25 pixel component hardware elements 1304, i.e. at least one for each corner of a pixel in a microtile, and at least 25 addition and comparison elements (which may, for example, be implemented as a plurality of adders) 1308, with each addition and comparison element 1308 generating an output result for a different pixel corner within the same microtile.

The hardware arrangement 1300 may additionally comprise one or more multiplexers 1310 that connect the pixel component hardware elements 1304, subsample component hardware elements 1306 and optionally the microtile component hardware elements 1302 to the addition and comparison elements 1308. In examples that include multiplexers 1310, one or more select signals control the operation of the multiplexers 1310 and in particular control which combination of the hardware elements 1302, 1304, 1306 are connected to each particular addition and comparison element 1308.

The hardware arrangement 1300 shown in FIG. 13A is configured to operate as the edge test hardware B 1103 in FIG. 11B and hence is configured to perform the first three tests of the method of FIG. 3. The other instances of edge test hardware in FIG. 11B may comprise a similar arrangement of hardware logic blocks as shown in FIG. 13A, but the numbers of each of the microtile component hardware elements 1302, pixel component hardware elements 1304 and subsample component hardware elements 1306 may differ. For example, in edge test hardware A 1102 (which is arranged to perform the fourth, fifth and sixth tests of the method of FIG. 3), there may be only one microtile component hardware element 1302, a plurality of pixel component hardware elements 1304 (e.g. at least 25) and a plurality of subsample component hardware elements 1306 (e.g. at least four).

If, as described above, the edge test hardware 1300 evaluates a SOP of the form:

f(x,y)=αx+βy+γ

where the values of the coefficients α, β, γ may be different for eachSOP evaluated, then the microtile component hardware element 1302evaluates:

f_(UT)(x_(UT), y_(UT)) = αx_(UT) + βy_(UT) + γ

For all instances of edge test hardware 1102-1104 when used for conservative rasterization and edge test hardware A 1102 when used for line rasterization, values of x_(UT) and y_(UT) are the microtile coordinates relative to the tile origin 1210 and differ for different microtiles. The microtile component hardware element 1302 may receive, as inputs, the values of α, β, γ, x_(UT) and y_(UT) and the element outputs a single result f_(UT). For conservative rasterization, the values of α, β, γ are the coefficients of the particular primitive edge (shown as Ax,Bx,Cx in FIGS. 11A and 11B, where x=[0,3]) and for edge test hardware A 1102 when used for line rasterization, the values of α, β, γ are the coefficients of the line being rasterised (shown as A,B,C in FIG. 11B). The values of the inputs for edge test hardware B when used for line rasterization are described below.

The pixel component hardware elements 1304 evaluate:

f_(P)(x_(P), y_(P)) = αx_(P) + βy_(P)

for different values of x_(P) and y_(P). For all instances of edge test hardware 1102-1104 when used for conservative rasterization these values differ for different pixel corners within a microtile. In these cases, the set of values of x_(P) and y_(P) (i.e. the values of x_(P) and y_(P) for all pixel corners within a microtile, as defined relative to the microtile origin) is the same for all microtiles and they may, for example, be calculated by the edge test hardware 1300 or may be accessed from a look-up table (LUT). In various examples, the origin of a microtile may be defined as the top left corner of each microtile and the values of x_(P) and y_(P) may be integers and so the determination of the values requires little or no computation (and hence this provides an efficient implementation). Referring back to the example shown in FIG. 12A, where each microtile comprises four rows of four pixels and hence there are five rows of five pixel corners 1212 as shown in FIG. 12B, then the set of values of x_(P) is {0, 1, 2, 3, 4} (which may also be written as [0,4]) and the set of values of y_(P) is {0, 1, 2, 3, 4} (which may also be written as [0,4]). Each pixel component hardware element 1304 receives as input A and B and may also receive the set of values of x_(P) and y_(P) (e.g. in examples where these are not integers). Each element 1304 outputs a single result f_(P) and consequently the calculation of f_(P) may be merged with any calculations that are performed to determine x_(P) and/or y_(P). The values of the inputs for edge test hardware B when used for line rasterization are described below.

The subsample component hardware element 1306, where provided, evaluates:

f_(S)(x_(S), y_(S)) = αx_(S) + βy_(S)

For instances of edge test hardware 1102-1104 when used for conservative rasterization where there is only a single subsample position per pixel and there is only a single value of x_(S) and y_(S), there is only one value of f_(S) and the value of f_(S) may be set to zero. The values of the inputs for edge test hardware B when used for line rasterization are described below.

For all instances of edge test hardware 1102-1104 when used for conservative rasterization and edge test hardware A 1102 when used for line rasterization, the addition and comparison elements 1308 evaluate:

f(x,y) = f_(UT) + f_(P) + f_(S)

and for edge test hardware B when used for line rasterization or for evaluating two parallel primitive edges in conservative rasterisation, the addition and comparison elements 1308 also evaluate:

f(x,y) = f_(UT) − f_(P) − f_(S)

Each addition and comparison element 1308 combines a different combination of f_(UT), f_(P) and f_(S) values (where the particular combination of values is provided as inputs to the addition and comparison element 1308) and the combination is either fixed (i.e. hardwired between the elements) or is selected by one or more multiplexers 1310 (where provided). To perform an edge test only the MSB (or sign-bit) of the result (i.e. of f(x, y)) may be output and hence in such examples the full result does not need to be calculated by the addition and comparison element 1308 and the addition and comparison element 1308 may perform a comparison rather than an addition (which reduces the overall area of the hardware). This MSB indicates the sign of the result (because a > b ≡ sign(b − a)) and, as described above, this indicates whether the pixel corner is to the left or right of the edge. In other examples, the full result may be generated and output.
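The decomposition described above can be illustrated with a minimal software sketch. The function names and the numeric values below are hypothetical and chosen only for illustration; the sketch assumes that the absolute sample position is the sum of the microtile, pixel and subsample offsets, so that summing the three components reproduces a direct evaluation of f(x, y), and that only the sign of the sum is needed for the edge test.

```python
def f_ut(alpha, beta, gamma, x_ut, y_ut):
    """Microtile component: the only term that includes gamma."""
    return alpha * x_ut + beta * y_ut + gamma


def f_p(alpha, beta, x_p, y_p):
    """Pixel component: offset of a pixel corner within the microtile."""
    return alpha * x_p + beta * y_p


def f_s(alpha, beta, x_s, y_s):
    """Subsample component: offset of a sample within the pixel."""
    return alpha * x_s + beta * y_s


def edge_value(alpha, beta, gamma, ut, p, s):
    """Sum of the three components, as formed by an addition element."""
    return (f_ut(alpha, beta, gamma, *ut)
            + f_p(alpha, beta, *p)
            + f_s(alpha, beta, *s))


def sign_bit(value):
    """1 if the value is negative (the MSB in two's complement), else 0."""
    return 1 if value < 0 else 0


# Hypothetical coefficients and positions, for illustration only.
alpha, beta, gamma = 3, -2, 5
ut, p, s = (8, 4), (2, 3), (0, 0)

total = edge_value(alpha, beta, gamma, ut, p, s)

# Direct evaluation at the absolute position (x, y) = ut + p + s.
x = ut[0] + p[0] + s[0]
y = ut[1] + p[1] + s[1]
direct = alpha * x + beta * y + gamma
```

Because the microtile term is shared by every pixel corner in the microtile, it only has to be computed once per SOP, which is what allows the per-corner work to be reduced to small additions.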

Where the hardware arrangement 1300 is used for conservative rasterization, edge test results corresponding to the four corners of a pixel that are output by the hardware arrangement 1300 for a particular primitive edge are combined using an OR logic function to determine an outer coverage result and an AND logic function to determine an inner coverage result. These inner and outer coverage results for the same pixel are combined with corresponding results for the same pixel generated by other such hardware arrangements that calculate SOPs for different primitive edges (i.e. the other instances of edge test hardware as shown in FIGS. 11A and 11B), using an AND logic function 1106 to generate the outer and inner coverage results for a primitive and a particular pixel.
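The combination of corner results described above can be sketched as follows. This is an illustrative model only: the function name `pixel_coverage` is hypothetical, and the sketch assumes that a corner result of True means that the corner is on the inside of the corresponding primitive edge.

```python
def pixel_coverage(corner_results_per_edge):
    """Combine per-corner edge test results for one pixel.

    corner_results_per_edge holds one 4-tuple of booleans per primitive
    edge (assumption: True means the corner is inside that edge).
    Returns (outer, inner) coverage for the primitive at this pixel.
    """
    # Per edge: OR over the four corners gives the outer result,
    # AND over the four corners gives the inner result.
    outer_per_edge = [any(corners) for corners in corner_results_per_edge]
    inner_per_edge = [all(corners) for corners in corner_results_per_edge]
    # Across edges: AND, corresponding to the AND logic function 1106.
    return all(outer_per_edge), all(inner_per_edge)


# A pixel straddling the second of three edges: partially covered.
partial = pixel_coverage([
    (True, True, True, True),
    (True, False, False, True),
    (True, True, True, True),
])

# A pixel with every corner inside every edge: fully covered.
full = pixel_coverage([(True, True, True, True)] * 3)

# A pixel wholly outside one edge: not covered at all.
outside = pixel_coverage([
    (True, True, True, True),
    (False, False, False, False),
    (True, True, True, True),
])
```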

Where the hardware arrangement 1300 is used for edge test hardware B 1103 for line rasterization, a first subsample component hardware element 1306, which calculates f_(S)(x_(S), y_(S))=αx_(S)+βy_(S), may be used to calculate f₉₀₁(x_(start),y_(start)) and f₉₀₁(x_(end),y_(end)) by inputting α=−0.5 and β=+0.5 along with x_(start),y_(start) and x_(end),y_(end). This first subsample component hardware element 1306 therefore corresponds to one of the second SOP hardware logic units 1004 in the hardware arrangement of FIG. 10. A second subsample component hardware element 1306 may be used to calculate f₉₀₃(x_(start),y_(start)) and f₉₀₃(x_(end),y_(end)) by inputting α=−0.5 and β=−0.5 along with x_(start),y_(start) and x_(end),y_(end), and hence corresponds to the other of the second SOP hardware logic units 1004 in the hardware arrangement of FIG. 10. Because, for line rasterization, the x, y coordinates for the start and end of the line being rasterised are high precision, unlike the subsample coordinates (e.g. when the hardware is used for edge calculations as part of conservative rasterization or in edge test hardware A for line rasterisation), when inputting values to the subsample component hardware element 1306, the values of α and β may be implicit constants (replacing the subsample locations) whilst the values of x_(start),y_(start) and x_(end),y_(end) are input instead of the a, b coefficients in subsample calculations for conservative rasterization.

In various examples, conservative rasterization may not use subsample component hardware elements and hence these may be only used for line rasterization or they may be used for both line rasterization and conservative rasterization but when used for conservative rasterization, the inputs may be provided such that f_(S)(x_(S), y_(S))=0, e.g. x_(S)=y_(S)=0 or α=β=0. Furthermore, as multiplying by ±0.5 does not require any multiplication, the subsample component hardware elements, where used in edge test hardware B 1103 for line rasterization only, may be implemented using a small number of addition and subtraction logic elements.
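The observation that the ±0.5 coefficients need no multiplier can be sketched in fixed-point arithmetic. The sketch below is an assumption-laden illustration, not the described hardware: the number of fractional bits (`FRAC_BITS`) and all function names are hypothetical, and only the αx+βy part of each expression is modelled. Halving is treated as a reinterpretation of the binary point (one extra fractional bit in the result), so each value is produced by a single addition or subtraction.

```python
FRAC_BITS = 8  # assumed input precision; not specified in the text


def to_fixed(value):
    """Convert a coordinate to fixed point with FRAC_BITS fractional bits."""
    return round(value * (1 << FRAC_BITS))


def fs_901(x_fx, y_fx):
    """-0.5*x + 0.5*y, returned with FRAC_BITS + 1 fractional bits.

    The factor of 0.5 is absorbed by the extra fractional bit, so only
    a subtraction is performed.
    """
    return y_fx - x_fx


def fs_903(x_fx, y_fx):
    """-0.5*x - 0.5*y, returned with FRAC_BITS + 1 fractional bits."""
    return -(x_fx + y_fx)


def to_real(value_fx):
    """Interpret a result that has FRAC_BITS + 1 fractional bits."""
    return value_fx / (1 << (FRAC_BITS + 1))


# Hypothetical start point of a line, for illustration only.
x_start, y_start = to_fixed(1.25), to_fixed(3.5)
r901 = to_real(fs_901(x_start, y_start))
r903 = to_real(fs_903(x_start, y_start))
```

Since an edge test only inspects the sign of the final result, the implicit scaling by two does not change the outcome provided every term combined with these values uses the same scale.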

In edge test hardware B 1103 for line rasterization, the two microtile component hardware elements 1302 may be used to calculate g and −g and the pixel component hardware elements 1304 may be used to calculate ih. In order to calculate the values of ih, the pixel component hardware elements 1304, which calculate f_(P)(x_(P), y_(P))=αx_(P)+βy_(P), may receive as inputs α=2, β=−0.5, a set of values of x_(P) which is {0, 1, 2, 3} (which may also be written as [0,3]) and a set of values of y_(P) which is also {0, 1, 2, 3}.

The second example hardware arrangement 1320, shown in FIG. 13B, is a variation on the hardware arrangement 1300 shown in FIG. 13A. This second example hardware arrangement 1320 comprises two single microtile component hardware elements 1302, a plurality of pixel component hardware elements 1324 (although these operate slightly differently to those shown in FIG. 13A and described above), one or more subsample component elements 1306 and a plurality of comparison elements (which may, for example, be implemented as a plurality of adders) 1328 (although these operate slightly differently to the addition and comparison elements 1308 shown in FIG. 13A and described above), with each comparison element 1328 generating an output result. Like the hardware arrangement 1300 shown in FIG. 13A, the hardware arrangement 1320 shown in FIG. 13B may additionally comprise one or more multiplexers 1310 controlled by select signals.

The microtile component hardware elements 1302 in FIG. 13B operate as described above with reference to FIG. 13A; however, instead of the output being fed directly into the comparison element 1328 (as shown in FIG. 13A), in the arrangement 1320 of FIG. 13B, the output of the microtile component hardware elements 1302 is input to each of the plurality of pixel component hardware elements 1324. The pixel component hardware elements 1324 in the arrangement 1320 of FIG. 13B do not operate in the same way as those shown in FIG. 13A. They receive as input (in addition to A and B) the output from the microtile component hardware element 1302, f_(UT), and evaluate:

f_(UT)(x_(UT), y_(UT)) + f_(P)(x_(P), y_(P)) = f_(UT)(x_(UT), y_(UT)) + αx_(P) + βy_(P)

and for edge test hardware B when used for line rasterization or for evaluating two parallel primitive edges in conservative rasterisation, the pixel component hardware elements 1324 also evaluate:

f_(UT)(x_(UT), y_(UT)) − f_(P)(x_(P), y_(P)) = f_(UT)(x_(UT), y_(UT)) − αx_(P) − βy_(P)

As described above (with reference to FIG. 13A) for all instances of edge test hardware used for conservative rasterization and edge test hardware A used for line rasterisation, the values of x_(P) and y_(P) (i.e. the values of x_(P) and y_(P) for all pixel corners within a microtile, as defined relative to the microtile origin) may be integers. Hence the pixel component hardware elements 1324 may comprise an arrangement of adders to add the appropriate multiples of α and/or β to the input value, f_(UT), generated by the microtile component hardware element 1302. This may be implemented without using any multipliers, which reduces the size and/or power consumption of the pixel component hardware elements 1324. Each element 1324 outputs a single result f_(UT)+f_(P) or f_(UT)−f_(P) and, as described above, the calculation of f_(P) and hence the calculation of the single result may be merged with any calculations that are performed to determine x_(P) and/or y_(P).
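The adder-only evaluation described above can be sketched as an incremental walk over the pixel corners of a microtile. This is a minimal illustration under stated assumptions: the function name `pixel_corner_values` and the sample coefficients are hypothetical, and the loop order is merely one way of adding multiples of α and β to f_(UT) without multiplying.

```python
def pixel_corner_values(f_ut, alpha, beta, corners_per_side=5):
    """Return f_UT + alpha*x_P + beta*y_P for integer x_P, y_P.

    Only additions are used: stepping one corner to the right adds
    alpha, and stepping one row down adds beta. The result is indexed
    as rows[y_P][x_P].
    """
    rows = []
    row_start = f_ut
    for _ in range(corners_per_side):      # y_P = 0 .. corners_per_side - 1
        value = row_start
        row = []
        for _ in range(corners_per_side):  # x_P = 0 .. corners_per_side - 1
            row.append(value)
            value += alpha
        rows.append(row)
        row_start += beta
    return rows


# Hypothetical microtile result and edge coefficients.
table = pixel_corner_values(f_ut=21, alpha=3, beta=-2)
```

For a microtile of four by four pixels this produces the 25 corner values, one for each pixel component hardware element in the conservative rasterization example.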

The comparison elements 1328 evaluate:

f(x,y) = f_(UT) + f_(P) + f_(S)

and for edge test hardware B when used for line rasterization or for evaluating two parallel primitive edges in conservative rasterisation, the comparison elements 1328 also evaluate:

f(x,y) = f_(UT) − f_(P) − f_(S)

in a similar manner to the addition and comparison elements 1308 described above; however the inputs are different since the values of f_(UT) and f_(P) have already been combined in the pixel component hardware elements 1324. Each comparison element 1328 combines a different combination of (f_(UT)±f_(P)) and f_(S) values (where the particular combinations of values are provided as inputs to the comparison elements 1328) and the combination is either fixed (i.e. hardwired) or is selected by one or more multiplexers 1310 (where provided). To perform an edge test only the MSB (or sign-bit) of the result (i.e. of f(x, y)) may be output and hence the full result does not need to be calculated by the comparison elements 1328. This MSB indicates the sign of the result and, as described above, this indicates whether the subsample position is to the left or right of the edge. In other examples, the full result may be calculated and output.
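The relationship between the two arrangements can be sketched in software. The following is an illustrative model only, with hypothetical function names; in particular, expressing the sign test as a comparison of the pre-combined value against −f_(S) is an assumption about how a comparison might replace the final addition, and is not stated in this form in the description.

```python
def edge_test_13a(f_ut, f_p, f_s):
    """FIG. 13A style: add all three components, then take the sign."""
    return (f_ut + f_p + f_s) < 0


def pixel_element_13b(f_ut, f_p):
    """FIG. 13B style pixel element: pre-combines f_UT and f_P."""
    return f_ut + f_p


def edge_test_13b(f_ut_plus_p, f_s):
    """FIG. 13B style comparison element.

    Assumption: the sign of (f_UT + f_P) + f_S is obtained by comparing
    the pre-combined value with -f_S, without forming the full sum.
    """
    return f_ut_plus_p < -f_s


# Exhaustive check over a small, hypothetical range of integer inputs.
results_match = all(
    edge_test_13a(u, p, s) == edge_test_13b(pixel_element_13b(u, p), s)
    for u in range(-6, 7)
    for p in range(-6, 7)
    for s in range(-6, 7)
)
```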

The hardware arrangement 1320 shown in FIG. 13B may utilize the fact that the value of f_(P) can be calculated quickly or alternatively the microtile calculation may be performed in the previous pipeline stage. By using this arrangement 1320 the overall area of the hardware arrangement may be reduced compared to the arrangement 1300 shown in FIG. 13A (e.g. the comparison elements 1328 may be smaller than addition and comparison elements 1308); however, each of the results output by the pixel component hardware elements 1324 comprises more bits (e.g. approximately 15 more bits for conservative rasterisation) than in the arrangement 1300 shown in FIG. 13A.

Where the hardware arrangement 1320 is used for conservative rasterization, and in the same way as described above with reference to FIG. 13A, edge test results output from the hardware arrangement 1320 corresponding to the four corners of a pixel are combined using an OR logic function to determine an outer coverage result and an AND logic function to determine an inner coverage result. These inner and outer coverage results for the same pixel are combined with corresponding results for the same pixel generated by other such hardware arrangements that calculate SOPs for different primitive edges (i.e. the other instances of edge test hardware as shown in FIGS. 11A and 11B), using an AND logic function 1106 to generate the outer and inner coverage results for a primitive and a particular pixel.

In a similar manner to the hardware arrangement 1300, where the hardware arrangement 1320 is used for line rasterization, a first subsample component hardware element 1306 may be used to calculate f₉₀₁(x_(start),y_(start)) and f₉₀₁(x_(end),y_(end)) and hence corresponds to one of the second SOP hardware logic units 1004 in the hardware arrangement of FIG. 10. A second subsample component hardware element 1306 may be used to calculate f₉₀₃(x_(start),y_(start)) and f₉₀₃(x_(end),y_(end)) and hence corresponds to the other of the second SOP hardware logic units 1004 in the hardware arrangement of FIG. 10. The microtile component hardware element 1302 may be used to calculate g and −g and the pixel component hardware elements 1324 may be used to calculate ih such that the combined constants g+h or −g−h are generated and input to the comparison element 1328.

By using the same hardware to perform both conservative rasterization (or any other process involving multiple edge calculations performed in the rasterization phase) and line rasterization, the overall size and power consumption of the rasterization phase 204 is reduced, or described in a different way, line rasterization functionality can be added to the rasterization hardware with only a small increase in size and power consumption. Additionally the line rasterization can be processed in the same logic path as triangle primitives, and as a result the surrounding hardware logic blocks do not need to be changed to support line rasterization. Furthermore, when evaluating the same end points (e.g. start and end points of the line) for multiple microtiles only a few bits will change (e.g. in the microtile component hardware elements and/or subsample component hardware elements). This means that evaluating the line for multiple microtiles in a row consumes little additional power.

The inputs used by the hardware for the first, second and third tests of the method of FIG. 3 are very different in nature to the inputs used for conservative rasterization. For conservative rasterization (and the fourth, fifth and sixth tests of the line rasterization method), the edges are variable (because they correspond to primitive edges) and the sample positions are fixed; whereas for the first, second and third tests of the line rasterization method, the sample positions are variable (as they are the start and end points of the line being rasterized) and the edges are fixed (as they correspond to extended diamond edges which are defined by pixel positions).

In the examples described above, particular calculations are described as being implemented in particular logic elements (e.g. in the arrangements shown in FIGS. 13A and 13B). It will be appreciated, however, that in other examples, some of the calculations may be performed in different logic elements to those described.

In the examples described above, various combinations of coefficients are provided (e.g. values of a, b, c and A, B, C, etc.). It will be appreciated that the particular coefficients are provided by way of example and other combinations may alternatively be used (e.g. the combinations of coefficients provided herein may be scaled to provide further combinations of coefficients that may be used).

In the examples described above, the first and second tests (blocks 306, 308) in FIG. 3 are performed using edge tests on the extended diamond edges (e.g. as shown in FIG. 7). In other examples, however, these tests may be implemented in a different way. For example, the x and y deltas between the pixel centre (marked with an X in FIG. 1) and the start/end point coordinates may be calculated, and the signs of the deltas may be used to evaluate the precise on-edge/corner conditions. In comparison to the method described above, using signs of deltas requires new logic and cannot reuse logic that can also be used for conservative rasterization.

FIG. 14 shows a computer system in which the graphics processing systems described herein may be implemented. The computer system comprises a CPU 1402, a GPU 1404, a memory 1406 and other devices 1414, such as a display 1416, speakers 1418 and a camera 1420. The GPU 1404 comprises line rasterization hardware 1422 and this may comprise a hardware arrangement as described above and shown in any of FIGS. 7, 10, 11 and 13. The components of the computer system can communicate with each other via a communications bus 1424. Where LUTs (e.g. as shown in FIG. 10) are used by the line rasterization hardware 1422, these may be stored within the GPU 1404 or in the memory 1406.

The hardware of FIGS. 2, 7, 10, 11A, 11B, 13A, 13B and 14 are shown as comprising a number of functional blocks. This is schematic only and is not intended to define a strict division between different logic elements of such entities. Each functional block may be provided in any suitable manner. It is to be understood that intermediate values described herein as being formed by a particular logic block need not be physically generated by that logic block at any point and may merely represent logical values which conveniently describe the processing performed by the hardware arrangement between its input and output.

The line rasterization hardware described herein may be embodied in hardware on an integrated circuit. The line rasterization hardware described herein may be configured to perform any of the methods described herein. Generally, any of the functions, methods, techniques or components described above can be implemented in software, firmware, hardware (e.g., fixed logic circuitry), or any combination thereof. The terms “module,” “functionality,” “component”, “element”, “unit”, “block” and “logic” may be used herein to generally represent software, firmware, hardware, or any combination thereof. In the case of a software implementation, the module, functionality, component, element, unit, block or logic represents program code that performs the specified tasks when executed on a processor. The algorithms and methods described herein could be performed by one or more processors executing code that causes the processor(s) to perform the algorithms/methods. Examples of a computer-readable storage medium include a random-access memory (RAM), read-only memory (ROM), an optical disc, flash memory, hard disk memory, and other memory devices that may use magnetic, optical, and other techniques to store instructions or other data and that can be accessed by a machine.

The terms computer program code and computer readable instructions as used herein refer to any kind of executable code for processors, including code expressed in a machine language, an interpreted language or a scripting language. Executable code includes binary code, machine code, bytecode, code defining an integrated circuit (such as a hardware description language or netlist), and code expressed in a programming language such as C, Java or OpenCL. Executable code may be, for example, any kind of software, firmware, script, module or library which, when suitably executed, processed, interpreted, compiled, executed at a virtual machine or other software environment, causes a processor of the computer system at which the executable code is supported to perform the tasks specified by the code.

A processor, computer, or computer system may be any kind of device, machine or dedicated circuit, or collection or portion thereof, with processing capability such that it can execute instructions. A processor may be any kind of general purpose or dedicated processor, such as a CPU, GPU, System-on-chip, state machine, media processor, an application-specific integrated circuit (ASIC), a programmable logic array, a field-programmable gate array (FPGA), physics processing units (PPUs), radio processing units (RPUs), digital signal processors (DSPs), general purpose processors (e.g. a general purpose GPU), microprocessors, any processing unit which is designed to accelerate tasks outside of a CPU, etc. A computer or computer system may comprise one or more processors. Those skilled in the art will realize that such processing capabilities are incorporated into many different devices and therefore the term ‘computer’ includes set top boxes, media players, digital radios, PCs, servers, mobile telephones, personal digital assistants and many other devices.

It is also intended to encompass software which defines a configuration of hardware as described herein, such as HDL (hardware description language) software, as is used for designing integrated circuits, or for configuring programmable chips, to carry out desired functions. That is, there may be provided a computer readable storage medium having encoded thereon computer readable program code in the form of an integrated circuit definition dataset that when processed (i.e. run) in an integrated circuit manufacturing system configures the system to manufacture a graphics processing unit configured to perform any of the methods described herein, or to manufacture a graphics processing unit comprising any apparatus described herein. An integrated circuit definition dataset may be, for example, an integrated circuit description.

Therefore, there may be provided a method of manufacturing, at an integrated circuit manufacturing system, line rasterization hardware as described herein. Furthermore, there may be provided an integrated circuit definition dataset that, when processed in an integrated circuit manufacturing system, causes the method of manufacturing line rasterization hardware to be performed.

An integrated circuit definition dataset may be in the form of computer code, for example as a netlist, code for configuring a programmable chip, as a hardware description language defining an integrated circuit at any level, including as register transfer level (RTL) code, as high-level circuit representations such as Verilog or VHDL, and as low-level circuit representations such as OASIS® and GDSII. Higher level representations which logically define an integrated circuit (such as RTL) may be processed at a computer system configured for generating a manufacturing definition of an integrated circuit in the context of a software environment comprising definitions of circuit elements and rules for combining those elements in order to generate the manufacturing definition of an integrated circuit so defined by the representation. As is typically the case with software executing at a computer system so as to define a machine, one or more intermediate user steps (e.g. providing commands, variables etc.) may be required in order for a computer system configured for generating a manufacturing definition of an integrated circuit to execute code defining an integrated circuit so as to generate the manufacturing definition of that integrated circuit.

An example of processing an integrated circuit definition dataset at an integrated circuit manufacturing system so as to configure the system to manufacture line rasterization hardware will now be described with respect to FIG. 15.

FIG. 15 shows an example of an integrated circuit (IC) manufacturing system 1502 which is configured to manufacture line rasterization hardware as described in any of the examples herein. In particular, the IC manufacturing system 1502 comprises a layout processing system 1504 and an integrated circuit generation system 1506. The IC manufacturing system 1502 is configured to receive an IC definition dataset (e.g. defining line rasterization hardware as described in any of the examples herein), process the IC definition dataset, and generate an IC according to the IC definition dataset (e.g. which embodies line rasterization hardware as described in any of the examples herein). The processing of the IC definition dataset configures the IC manufacturing system 1502 to manufacture an integrated circuit embodying line rasterization hardware as described in any of the examples herein.

The layout processing system 1504 is configured to receive and process the IC definition dataset to determine a circuit layout. Methods of determining a circuit layout from an IC definition dataset are known in the art, and for example may involve synthesising RTL code to determine a gate level representation of a circuit to be generated, e.g. in terms of logical components (e.g. NAND, NOR, AND, OR, MUX and FLIP-FLOP components). A circuit layout can be determined from the gate level representation of the circuit by determining positional information for the logical components. This may be done automatically or with user involvement in order to optimise the circuit layout. When the layout processing system 1504 has determined the circuit layout it may output a circuit layout definition to the IC generation system 1506. A circuit layout definition may be, for example, a circuit layout description.

The IC generation system 1506 generates an IC according to the circuit layout definition, as is known in the art. For example, the IC generation system 1506 may implement a semiconductor device fabrication process to generate the IC, which may involve a multiple-step sequence of photolithographic and chemical processing steps during which electronic circuits are gradually created on a wafer made of semiconducting material. The circuit layout definition may be in the form of a mask which can be used in a lithographic process for generating an IC according to the circuit definition. Alternatively, the circuit layout definition provided to the IC generation system 1506 may be in the form of computer-readable code which the IC generation system 1506 can use to form a suitable mask for use in generating an IC.

The different processes performed by the IC manufacturing system 1502 may be implemented all in one location, e.g. by one party. Alternatively, the IC manufacturing system 1502 may be a distributed system such that some of the processes may be performed at different locations, and may be performed by different parties. For example, some of the stages of: (i) synthesising RTL code representing the IC definition dataset to form a gate level representation of a circuit to be generated, (ii) generating a circuit layout based on the gate level representation, (iii) forming a mask in accordance with the circuit layout, and (iv) fabricating an integrated circuit using the mask, may be performed in different locations and/or by different parties.

In other examples, processing of the integrated circuit definition dataset at an integrated circuit manufacturing system may configure the system to manufacture line rasterization hardware without the IC definition dataset being processed so as to determine a circuit layout. For instance, an integrated circuit definition dataset may define the configuration of a reconfigurable processor, such as an FPGA, and the processing of that dataset may configure an IC manufacturing system to generate a reconfigurable processor having that defined configuration (e.g. by loading configuration data to the FPGA).

In some embodiments, an integrated circuit manufacturing definition dataset, when processed in an integrated circuit manufacturing system, may cause an integrated circuit manufacturing system to generate a device as described herein. For example, the configuration of an integrated circuit manufacturing system in the manner described above with respect to FIG. 15 by an integrated circuit manufacturing definition dataset may cause a device as described herein to be manufactured.

In some examples, an integrated circuit definition dataset could include software which runs on hardware defined at the dataset or in combination with hardware defined at the dataset. In the example shown in FIG. 15, the IC generation system may further be configured by an integrated circuit definition dataset to, on manufacturing an integrated circuit, load firmware onto that integrated circuit in accordance with program code defined at the integrated circuit definition dataset or otherwise provide program code with the integrated circuit for use with the integrated circuit.

Those skilled in the art will realize that storage devices utilized to store program instructions can be distributed across a network. For example, a remote computer may store an example of the process described as software. A local or terminal computer may access the remote computer and download a part or all of the software to run the program. Alternatively, the local computer may download pieces of the software as needed, or execute some software instructions at the local terminal and some at the remote computer (or computer network). Those skilled in the art will also realize that by utilizing conventional techniques known to those skilled in the art that all, or a portion of the software instructions may be carried out by a dedicated circuit, such as a DSP, programmable logic array, or the like.

The methods described herein may be performed by a computer configured with software in machine readable form stored on a tangible storage medium e.g. in the form of a computer program comprising computer readable program code for configuring a computer to perform the constituent portions of described methods or in the form of a computer program comprising computer program code means adapted to perform all the steps of any of the methods described herein when the program is run on a computer and where the computer program may be embodied on a computer readable storage medium. Examples of tangible (or non-transitory) storage media include disks, thumb drives, memory cards etc. and do not include propagated signals. The software can be suitable for execution on a parallel processor or a serial processor such that the method steps may be carried out in any suitable order, or simultaneously.

The hardware components described herein may be generated by a non-transitory computer readable storage medium having encoded thereon computer readable program code.

Memories storing machine executable data for use in implementing disclosed aspects can be non-transitory media. Non-transitory media can be volatile or non-volatile. Examples of volatile non-transitory media include semiconductor-based memory, such as SRAM or DRAM. Examples of technologies that can be used to implement non-volatile memory include optical and magnetic memory technologies, flash memory, phase change memory, resistive RAM.

A particular reference to “logic” refers to structure that performs a function or functions. An example of logic includes circuitry that is arranged to perform those function(s). For example, such circuitry may include transistors and/or other hardware elements available in a manufacturing process. Such transistors and/or other elements may be used to form circuitry or structures that implement and/or contain memory, such as registers, flip flops, or latches, logical operators, such as Boolean operations, mathematical operators, such as adders, multipliers, or shifters, and interconnect, by way of example. Such elements may be provided as custom circuits or standard cell libraries, macros, or at other levels of abstraction. Such elements may be interconnected in a specific arrangement. Logic may include circuitry that is fixed function and circuitry can be programmed to perform a function or functions; such programming may be provided from a firmware or software update or control mechanism. Logic identified to perform one function may also include logic that implements a constituent function or sub-process. In an example, hardware logic has circuitry that implements a fixed function operation, or operations, state machine or process.

The implementation of concepts set forth in this application in devices, apparatus, modules, and/or systems (as well as in methods implemented herein) may give rise to performance improvements when compared with known implementations. The performance improvements may include one or more of increased computational performance, reduced latency, increased throughput, and/or reduced power consumption. During manufacture of such devices, apparatus, modules, and systems (e.g. in integrated circuits) performance improvements can be traded-off against the physical implementation, thereby improving the method of manufacture. For example, a performance improvement may be traded against layout area, thereby matching the performance of a known implementation but using less silicon. This may be done, for example, by reusing functional blocks in a serialised fashion or sharing functional blocks between elements of the devices, apparatus, modules and/or systems. Conversely, concepts set forth in this application that give rise to improvements in the physical implementation of the devices, apparatus, modules, and systems (such as reduced silicon area) may be traded for improved performance. This may be done, for example, by manufacturing multiple instances of a module within a predefined area budget.

Any range or device value given herein may be extended or altered without losing the effect sought, as will be apparent to the skilled person.

It will be understood that the benefits and advantages described above may relate to one embodiment or may relate to several embodiments. The embodiments are not limited to those that solve any or all of the stated problems or those that have any or all of the stated benefits and advantages.

Any reference to ‘an’ item refers to one or more of those items. The term ‘comprising’ is used herein to mean including the method blocks or elements identified, but that such blocks or elements do not comprise an exclusive list and an apparatus may contain additional blocks or elements and a method may contain additional operations or elements. Furthermore, the blocks, elements and operations are themselves not impliedly closed.

The steps of the methods described herein may be carried out in any suitable order, or simultaneously where appropriate. The arrows between boxes in the figures show one example sequence of method steps but are not intended to exclude other sequences or the performance of multiple steps in parallel. Additionally, individual blocks may be deleted from any of the methods without departing from the spirit and scope of the subject matter described herein. Aspects of any of the examples described above may be combined with aspects of any of the other examples described to form further examples without losing the effect sought. Where elements of the figures are shown connected by arrows, it will be appreciated that these arrows show just one example flow of communications (including data and control messages) between elements. The flow between elements may be in either direction or in both directions.

The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in the light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.

What is claimed is:
 1. A method of rasterising a line in a graphics processing pipeline, the line having a start point and an end point, the method comprising, for a pixel: determining whether: a) an extended line passing through the start and end point has a slope less than −1 or greater than +1 and touches the right point of a diamond test area within the pixel; b) the extended line touches the bottom point of the diamond test area; c) the extended line is not on a same side of each point of the diamond test area; and in response to determining that at least one of a), b) or c) is true, adding the pixel to be drawn as part of the line.
 2. The method according to claim 1, further comprising: rasterizing the line using those pixels to be drawn as part of the line.
 3. The method according to claim 1, further comprising: receiving, as inputs, coefficients of the extended line, and wherein: determining that at least one of a), b) or c) is true is based on the coefficients of the extended line.
 4. The method according to claim 1, further comprising: calculating coefficients of the extended line based on coordinates of the start point and the end point, and wherein: determining that at least one of a), b) or c) is true is based on the coefficients of the extended line.
 5. The method according to claim 1, further comprising: generating, for each pixel, a first set of edge test results, the first set of edge test results comprising edge test results for each extended diamond edge of the diamond test area within the pixel and for sample positions at each of the start point and end point of the line, wherein an extended diamond edge is coincident with an edge of the diamond test area and extends beyond the diamond points that the edge connects and wherein an edge test result for an edge and a sample position indicates whether the sample position is to the left of the edge or to the right of the edge or exactly on the edge; and generating a second set of edge test results, the second set of edge test results comprising edge test results for the line and for sample positions corresponding to each point of each diamond test area for each pixel; wherein determining if the extended line passing through the start and end point has a slope less than −1 or greater than +1 and touches the right point of the diamond test area comprises: receiving a first subset of the second set of edge test results and determining from the results in the first subset and for each pixel, if the extended line has a slope less than −1 or greater than +1 and touches the right point of the diamond test area; wherein determining if the extended line touches the bottom point of the diamond test area comprises: receiving a second subset of the second set of edge test results and determining from the results in the second subset and for each pixel, if the extended line touches the bottom point of the diamond test area; and wherein determining if the extended line is on a same side of each point of the diamond test area comprises: receiving a third subset of the second set of edge test results and determining from the results in the third subset and for each pixel, if the extended line is on a same side of each point of the diamond test area.
 6. The method according to claim 5, wherein the extended diamond edges of the diamond test area within a pixel comprise two pairs of parallel edges and generating, for each pixel, the first set of edge test results, comprises: generating edge test results for a first edge in each pair of parallel edges and for sample positions at each of the start point and end point of the line; and calculating edge test results for a second edge in each pair of parallel edges by inverting the results for the first edges in each pair and adding a constant to each inverted result.
 7. The method according to claim 5, wherein the method is implemented substantially in parallel for each pixel in an input set of pixels and wherein generating a second set of edge test results, the second set of edge test results comprising edge test results for the line and for sample positions corresponding to each point of each diamond test area for each pixel comprises: generating a second set of edge test results, the second set of edge test results comprising edge test results for the line and for sample positions corresponding to each point of each diamond test area for each pixel in the input set of pixels.
 8. The method according to claim 1, further comprising: in response to determining that (a) the extended line does not both touch the right point of the diamond test area and have a slope less than −1 or greater than +1, (b) that the extended line does not touch the bottom point of the diamond test area, and (c) that the extended line is on a same side of each point of the diamond test area, not including the pixel as part of the line when rasterizing the line.
 9. The method according to claim 1, further comprising, for each pixel to be drawn as part of the line: determining whether the pixel is inside a bounding box of the line; and in response to determining that the pixel is outside the bounding box of the line, removing the pixel from the pixels to be drawn as part of the line.
 10. The method according to claim 1, further comprising, for each pixel in the pixels to be drawn as part of the line: generating a bounding box of the line; and generating the input pixels by adding to an input set of pixels, each pixel in the bounding box of the line.
 11. A graphics processing pipeline comprising a rasterization phase, the rasterization phase comprising hardware logic arranged to: a) determine whether an extended line passing through the start and end point has a slope less than −1 or greater than +1 and touches the right point of a diamond test area within the pixel; b) determine whether the extended line touches the bottom point of the diamond test area; c) determine whether the extended line is not on a same side of each point of the diamond test area; and in response to determining that at least one of a), b) or c) is true, add the pixel to be drawn as part of the line.
 12. The graphics processing pipeline according to claim 11, wherein the hardware logic comprises: first edge test hardware logic arranged to generate, for each pixel, edge test results for each extended diamond edge of a diamond test area within the pixel and for sample positions at each of the start point and end point of the line, wherein an extended diamond edge is coincident with an edge of the diamond test area and extends beyond the diamond points that the edge connects and wherein an edge test result for an edge and a sample position indicates whether the sample position is to the left of the edge or to the right of the edge or exactly on the edge; second edge test hardware logic arranged to generate edge test results for the line and for sample positions corresponding to each point of each diamond test area for each pixel; first test hardware logic arranged to receive a first subset of the edge test results from the first edge test hardware logic and to determine from the received results and for a pixel, if the end point of the line is in the diamond test area within the pixel; second test hardware logic arranged to receive a first subset of the edge test results from the second edge test hardware logic, to determine from the received results and for each pixel, if the extended line has a slope less than −1 or greater than +1 and touches the right point of the diamond test area and to output a result indicating whether the extended line has a slope less than −1 or greater than +1 and touches the right point of the diamond test area; third test hardware logic arranged to receive a second subset of the edge test results from the second edge test hardware logic, to determine from the received results and for each pixel, if the extended line touches the bottom point of the diamond test area and to output a result indicating whether the extended line touches the bottom point of the diamond test area; fourth test hardware logic arranged to receive a third subset of the edge test results from the second edge test hardware logic, to determine from the received results and for each pixel, if the extended line is on a same side of each point of the diamond test area and to output a result indicating whether the extended line is on a same side of each point of the diamond test area; and further hardware logic arranged to receive outputs from the second, third and fourth test hardware logic and to determine from the outputs and for each pixel, if the extended line has a slope less than −1 or greater than +1 and touches the right point of the diamond test area or that the extended line touches the bottom point of the diamond test area or that the extended line is not on a same side of each point of the diamond test area and in response to determining that an extended line passing through the start and end point is on a lower-left or lower-right edge of the diamond test area or that the extended line has a slope less than −1 or greater than +1 and touches the right point of the diamond test area or that the extended line touches the bottom point of the diamond test area or that the extended line is not on a same side of each point of the diamond test area, to add the pixel to be drawn as part of the line.
 13. The graphics processing pipeline according to claim 11, wherein the hardware logic is further arranged to perform normal or conservative rasterization.
 14. The graphics processing pipeline according to claim 13, wherein the hardware logic comprises: a first instance of edge test hardware arranged, for normal or conservative rasterization, to perform edge test calculations in relation to a first edge of a primitive; a second instance of edge test hardware arranged, for normal or conservative rasterization, to perform edge test calculations in relation to a second edge of a primitive and for line rasterization to determine whether the end point and/or the start point of the line is in a diamond test area within the pixel, in response to determining that the end point is not in the diamond test area and the start point of the line is in the diamond test area, to add the pixel to be drawn as part of the line, and in response to determining that neither the start point nor the end point of the line are in the diamond test area, determine if the line crosses more than one extended diamond edge; a third instance of edge test hardware arranged, for normal or conservative rasterization, to perform edge test calculations in relation to a third edge of a primitive; and wherein either the first or third instance of edge test hardware is arranged, for line rasterization, in response to determining that the line crosses more than one extended diamond edge: to determine if an extended line passing through the start and end point has a slope less than −1 or greater than +1 and touches the right point of the diamond test area; to determine if the extended line touches the bottom point of the diamond test area; to determine if the extended line is on a same side of each point of the diamond test area; and in response to determining that the extended line has a slope less than −1 or greater than +1 and touches the right point of the diamond test area or that the extended line touches the bottom point of the diamond test area or that the extended line is not on a same side of each point of the diamond test area, to add the pixel to be drawn as part of the line.
 15. The graphics processing pipeline according to claim 14, wherein the primitive is a parallelogram and the second instance of edge test hardware is arranged, for normal or conservative rasterization, to perform edge test calculations in relation to both the second edge of a primitive and a fourth edge of the primitive, wherein the second and fourth edges are parallel edges.
 16. The graphics processing pipeline according to claim 11, wherein the hardware logic further comprises: a hardware logic block comprising an input for receiving coordinates of the start point and the end point and hardware logic arranged to calculate coefficients of the extended line based on coordinates of the start point and the end point.
 17. A non-transitory computer readable storage medium having stored thereon computer executable code that when executed causes a GPU to rasterize, for each pixel in an input set of pixels, a line having a start point and an end point, by: a) determining whether an extended line passing through the start and end point has a slope less than −1 or greater than +1 and touches the right point of a diamond test area within the pixel; b) determining whether the extended line touches the bottom point of the diamond test area; c) determining whether the extended line is not on a same side of each point of the diamond test area; and in response to determining that at least one of a), b) or c) is true, adding the pixel to be drawn as part of the line.
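
By way of illustration only, the three determinations a), b) and c) recited above, together with the calculation of line coefficients from the start and end points, may be sketched in software as follows. This is a minimal sketch under stated assumptions and is not the hardware logic described herein. The function names, the edge-function form A*x + B*y + C, the placement of the diamond test area at the pixel centre with half-pixel extent, the downward-growing y axis, and the reading of "not on a same side" as "at least one diamond point strictly on each side of the extended line" are all assumptions made for the example.

```python
from fractions import Fraction


def line_coefficients(start, end):
    """Return (A, B, C) such that A*x + B*y + C == 0 on the extended line.

    Assumed edge-function form; the sign of A*x + B*y + C tells which
    side of the extended line a sample position lies on.
    """
    (x0, y0), (x1, y1) = start, end
    a = y0 - y1
    b = x1 - x0
    c = x0 * y1 - x1 * y0
    return a, b, c


def diamond_points(px, py):
    """Four points of the diamond test area in pixel (px, py).

    Assumption: the diamond is centred on the pixel centre, extends half
    a pixel in each direction, and y grows downwards (so "bottom" has
    the larger y coordinate).
    """
    cx = Fraction(2 * px + 1, 2)
    cy = Fraction(2 * py + 1, 2)
    h = Fraction(1, 2)
    return {
        "left": (cx - h, cy),
        "right": (cx + h, cy),
        "top": (cx, cy - h),
        "bottom": (cx, cy + h),
    }


def pixel_passes_tests(start, end, px, py):
    """Return True if at least one of tests a), b) or c) is true."""
    a, b, c = line_coefficients(start, end)
    values = {
        name: a * x + b * y + c
        for name, (x, y) in diamond_points(px, py).items()
    }
    # |dy| > |dx| corresponds to a slope less than -1 or greater than +1
    # (a vertical line is treated as steep in this sketch).
    steep = abs(a) > abs(b)
    test_a = steep and values["right"] == 0   # a) steep and touches right point
    test_b = values["bottom"] == 0            # b) touches bottom point
    # c) assumed reading: diamond points lie strictly on both sides.
    test_c = (any(v > 0 for v in values.values())
              and any(v < 0 for v in values.values()))
    return test_a or test_b or test_c
```

Exact arithmetic (integers or `Fraction` values) is used so that the "touches" comparisons against zero are not disturbed by rounding. The sketch evaluates only the extended line, so a caller would still need to apply the start point and end point checks and any bounding box filtering described elsewhere herein; for example, `pixel_passes_tests((0, 1), (4, 1), 1, 0)` tests pixel (1, 0) against a horizontal line that passes through that pixel's assumed bottom diamond point.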